[antlr-interest] case sensitivity for ANTLR v3 lexers
Terence Parr
parrt at cs.usfca.edu
Tue May 16 10:58:07 PDT 2006
On May 16, 2006, at 10:50 AM, Martin Probst wrote:
>> Soon we will need case insensitive lexing for v3. I am hoping to
>> leave the input stream stuff alone and just subclass Lexer as
>> CaseInsensitiveLexer, which overrides match()
>> methods. Then alter code gen for char set matching (because it's
>> generated inline).
>>
>> The tokens would have the unmolested input chars.
>>
>> Does this sound right?
>
> No idea, but did you think about internationalization issues? I
> mean, in European languages there is a clear, defined concept of
> upper case and lower case. However, I think there are some Asian
> languages etc where this is not exactly true, and
> java.lang.String#equalsIgnoreCase() doesn't get it right as far as
> I know. Maybe provide an overridable (ouch) method of some kind?
If I override match(char c) so that it uses Character.toUpperCase()
or whatever, it should be ok I think.
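
A minimal sketch of that idea, under loud assumptions: SketchLexer and
CaseInsensitiveSketchLexer are hypothetical stand-ins written only for
illustration, not the ANTLR v3 runtime classes, and the real match()
signatures may differ. The input is never altered, so the matched text
keeps its original casing; only the comparison folds case.

```java
// Hypothetical stand-in for a lexer base class; NOT the ANTLR v3 runtime API.
class SketchLexer {
    protected final String input;
    protected int pos = 0;

    SketchLexer(String input) {
        this.input = input;
    }

    // Match exactly one character or fail; the input is never modified.
    void match(char c) {
        if (pos >= input.length() || input.charAt(pos) != c) {
            throw new IllegalStateException("mismatched char at index " + pos);
        }
        pos++;
    }

    // Match a literal by delegating to match(char), so an override of
    // match(char) in a subclass changes literal matching as well.
    void match(String s) {
        for (int i = 0; i < s.length(); i++) {
            match(s.charAt(i));
        }
    }

    // Text consumed so far, with its original (unmolested) casing.
    String consumedText() {
        return input.substring(0, pos);
    }
}

// Hypothetical subclass: overrides only match(char) to compare case-insensitively.
class CaseInsensitiveSketchLexer extends SketchLexer {
    CaseInsensitiveSketchLexer(String input) {
        super(input);
    }

    @Override
    void match(char c) {
        // Fold both sides with Character.toUpperCase before comparing.
        if (pos >= input.length()
                || Character.toUpperCase(input.charAt(pos)) != Character.toUpperCase(c)) {
            throw new IllegalStateException("mismatched char at index " + pos);
        }
        pos++;
    }
}
```

With this sketch, new CaseInsensitiveSketchLexer("SeLeCt x").match("select")
succeeds and consumedText() still returns the original "SeLeCt". Note that
Character.toUpperCase(char) is a one-to-one, locale-independent mapping, so
it does not cover cases such as German ß, which upper-cases to "SS" only at
the String level. Keeping the comparison in one overridable method leaves
room to swap in a different folding rule.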
Ter