[antlr-interest] case sensitivity for ANTLR v3 lexers
Martin Probst
mail at martin-probst.com
Tue May 16 10:50:28 PDT 2006
> Soon we will need case-insensitive lexing for v3. I am hoping to
> leave the input stream stuff alone and just subclass Lexer as
> CaseInsensitiveLexer, which overrides the match() methods. Then alter
> code gen for char set matching (because it's generated inline).
>
> The tokens would have the unmolested input chars.
>
> Does this sound right?
No idea, but did you think about internationalization issues? I mean,
in European languages there is a clear, defined concept of upper case
and lower case. However, I think there are some Asian languages, etc.,
where this is not exactly true, and as far as I know
java.lang.String#equalsIgnoreCase() doesn't get it right. Maybe provide
an overridable (ouch) method of some kind?
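
To make the "overridable method" idea concrete, here is a rough sketch in plain Java. None of these names come from ANTLR: CaseFolder, TurkishCaseFolder, fold() and matchesAt() are all made up for illustration, and a real v3 lexer would presumably wire such a hook into match() differently. The default fold() uses the same per-character upper-then-lower rule that equalsIgnoreCase() applies, the Turkish subclass only shows how a language with different rules for "I" could override it, and the input text is never modified.

```java
/** Hypothetical helper (not an ANTLR class): case-insensitive matching with an overridable rule. */
class CaseFolder {
    /** Normalizes one code point before comparison; override for language-specific rules. */
    protected int fold(int codePoint) {
        // Per-character rule also used by String#equalsIgnoreCase: upper-case, then lower-case.
        return Character.toLowerCase(Character.toUpperCase(codePoint));
    }

    /** True if input, starting at offset, matches literal under fold(). The input is left untouched. */
    public final boolean matchesAt(CharSequence input, int offset, String literal) {
        int i = offset;
        int j = 0;
        while (j < literal.length()) {
            if (i >= input.length()) {
                return false; // ran out of input before the literal ended
            }
            int a = Character.codePointAt(input, i);
            int b = literal.codePointAt(j);
            if (fold(a) != fold(b)) {
                return false;
            }
            i += Character.charCount(a);
            j += Character.charCount(b);
        }
        return true;
    }
}

/** Illustrative override: Turkish pairs I with dotless ı (U+0131) and İ (U+0130) with i. */
class TurkishCaseFolder extends CaseFolder {
    @Override
    protected int fold(int codePoint) {
        if (codePoint == 'I') {
            return 0x0131; // I folds to dotless ı
        }
        if (codePoint == 0x0130) {
            return 'i';    // İ folds to dotted i
        }
        return Character.toLowerCase(codePoint);
    }
}

class CaseFolderDemo {
    public static void main(String[] args) {
        String input = "select * FROM t";
        // The keyword matches regardless of case; the token text would still be "FROM".
        System.out.println(new CaseFolder().matchesAt(input, 9, "from"));
        // Under the Turkish rule, "TITLE" no longer matches "title", because I folds to ı.
        System.out.println(new TurkishCaseFolder().matchesAt("TITLE", 0, "title"));
    }
}
```

The point of keeping fold() as the only overridable piece is that the matching loop stays the same for every language, and whoever needs different rules only has to replace the per-character mapping. It still compares one code point at a time, so mappings that change length (such as ß to SS) would need something more elaborate.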
Martin