[antlr-interest] Ignoring case within token definition?

C. Mundi cmundi at gmail.com
Fri May 30 08:24:49 PDT 2008


This is more of a side question than an answer to the OP's question:

Is case-insensitivity an appropriate (read: orthodox in the canon of
language recognition) role for a lexer?

I am genuinely interested in the opinion of language experts on this
question.  My amateur logic goes like this:

It seems to me that case insensitivity -- if it is to be applied more or
less uniformly, independent of context -- is a job for an input filter.  Requiring a
lexer to recognize case equivalence seems like a complication unlikely to
support robust behavior or maintainability of the lexer generator itself or
the resulting lexer.

When I have needed to do this (long ago in a galaxy far away) I simply
filtered the lexer's input stream to convert all non-escaped characters to lower
case (for lexers designed to recognize lower case).  So quoted strings and
escape sequences are passed through unchanged to the lexer, but everything
else the lexer gets is lower case.  I realize that this is probably not a
Unicode-friendly solution.  :)  But it was a practical engineering solution
for me, and that code is still in production.
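
A minimal sketch of that kind of filter, in Python for brevity. This is not
the production code mentioned above; the function name, the choice of quote
characters, and the backslash escape convention are all assumptions made for
illustration. It lowercases every character except those inside quoted
strings and those that form a two-character escape sequence.

```python
def lowercase_outside_quotes(text, quote_chars="\"'", escape_char="\\"):
    """Lowercase text, leaving quoted strings and escape sequences unchanged.

    Assumptions (hypothetical, not from the original post): strings are
    delimited by matching ' or " characters, and an escape sequence is the
    escape character followed by exactly one character.
    """
    out = []
    in_quote = None  # the quote character that opened the current string
    i = 0
    n = len(text)
    while i < n:
        ch = text[i]
        if ch == escape_char and i + 1 < n:
            # Pass the escape character and the escaped character through.
            out.append(text[i:i + 2])
            i += 2
            continue
        if in_quote is not None:
            # Inside a quoted string: copy verbatim until the closing quote.
            out.append(ch)
            if ch == in_quote:
                in_quote = None
        elif ch in quote_chars:
            in_quote = ch
            out.append(ch)
        else:
            out.append(ch.lower())
        i += 1
    return "".join(out)


if __name__ == "__main__":
    print(lowercase_outside_quotes('SELECT "Name" FROM T'))
```

The lexer would then be fed the filtered text instead of the raw input. Note
that str.lower() can change the length of some non-ASCII text, so character
offsets reported by the lexer may no longer line up with the original input;
that is one concrete way in which this approach is not Unicode-friendly.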

cm


On Fri, May 30, 2008 at 4:15 AM, Haralambi Haralambiev <
hharalambiev at gmail.com> wrote:

> Hello,
>
> I was wondering whether it is possible to ignore letter case when
> matching a token rule.
>
> There are several languages that ignore the case of keywords, so
> I wonder how to match all cases in a beautiful and efficient manner?
>
> The only way I could think of is the following (suppose the keyword is
> "keyword"):
>
> Keyword: ('K'|'k') ('E'|'e') ('Y'|'y') ('W'|'w') ('O'|'o') ('R'|'r')
> ('D'|'d');
>
> This, obviously, is not nice to type... I would love it if there were some
> option to tell the lexer rule to ignore case. Is there any?
>
> Best Regards,
> Hari
>
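
If the per-letter alternation quoted above is the route taken, the rule text
can at least be generated rather than typed by hand. A rough sketch in
Python follows; the helper name is hypothetical, and it assumes the keyword
contains only plain letters and digits, so nothing needs escaping inside the
single-quoted literals.

```python
def case_insensitive_rule(rule_name, keyword):
    """Build a lexer rule string that matches keyword in any letter case.

    Hypothetical helper: assumes keyword holds only characters that are
    safe inside a single-quoted literal (no quotes or backslashes).
    """
    parts = []
    for ch in keyword:
        upper, lower = ch.upper(), ch.lower()
        if upper != lower:
            # Cased letter: accept either form.
            parts.append("('%s'|'%s')" % (upper, lower))
        else:
            # Digits and other uncased characters match literally.
            parts.append("'%s'" % ch)
    return "%s: %s;" % (rule_name, " ".join(parts))


if __name__ == "__main__":
    print(case_insensitive_rule("Keyword", "keyword"))
```

The output for "keyword" is the same rule shown in the quoted message, on a
single line, ready to paste into a grammar.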