[antlr-interest] case sensitivity for ANTLR v3 lexers

Martin Probst mail at martin-probst.com
Tue May 16 10:50:28 PDT 2006


> Soon we will need case insensitive lexing for v3.  I am hoping to
> leave the input stream stuff alone and just subclass Lexer as
> CaseInsensitiveLexer, which overrides the match() methods.  Then alter
> code gen for char set matching (because it's generated inline).
>
> The tokens would have the unmolested input chars.
>
> Does this sound right?

No idea, but have you thought about internationalization issues? In
European languages there is a clear, well-defined concept of upper case
and lower case. However, I think there are some Asian languages etc.
where this is not exactly true, and as far as I know
java.lang.String#equalsIgnoreCase() doesn't get it right. Maybe provide
an overridable (ouch) method of some kind?
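
Roughly what I have in mind, as a sketch only. SketchLexer,
CaseInsensitiveSketchLexer, fold() and consumed() are made-up names for
illustration and not the ANTLR v3 API; the real match() methods and the
inline char set code would look different. The idea is that comparison
goes through one overridable hook while the input itself is left
untouched:

```java
// Hypothetical stand-in for a lexer base class (assumption, NOT ANTLR's Lexer).
class SketchLexer {
    protected final String input;
    protected int pos = 0;

    SketchLexer(String input) {
        this.input = input;
    }

    // Overridable hook: how a char is normalised before comparison.
    // Default is the identity, i.e. case sensitive matching.
    protected int fold(int c) {
        return c;
    }

    // Try to match a literal at the current position, comparing folded chars.
    // The input is never rewritten, so token text keeps its original case.
    boolean match(String literal) {
        if (pos + literal.length() > input.length()) {
            return false;
        }
        for (int i = 0; i < literal.length(); i++) {
            if (fold(input.charAt(pos + i)) != fold(literal.charAt(i))) {
                return false;
            }
        }
        pos += literal.length();
        return true;
    }

    // Text consumed since 'start', exactly as it appeared in the input.
    String consumed(int start) {
        return input.substring(start, pos);
    }
}

// Case insensitive variant: only the hook changes.
class CaseInsensitiveSketchLexer extends SketchLexer {
    CaseInsensitiveSketchLexer(String input) {
        super(input);
    }

    @Override
    protected int fold(int c) {
        // Locale-independent, per-char folding. A language-specific
        // subclass could override this again with its own rules.
        return Character.toLowerCase(Character.toUpperCase(c));
    }
}
```

With this, new CaseInsensitiveSketchLexer("SeLeCt x").match("select")
succeeds and consumed(0) still gives "SeLeCt". A per-char hook cannot
express mappings that change length (for example "ß" versus "SS"), and
a grammar for Turkish would presumably want to override fold() so that
I pairs with dotless ı.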

Martin


