[antlr-interest] proper way to handle case insensitive syntax

Jim Idle jimi at temporal-wave.com
Thu Jul 14 09:34:10 PDT 2011


I would not use that method. Instead, override LA() on the input stream
so that it returns upper case only, then specify the keywords in upper
case in the grammar. That greatly reduces the complexity of the keyword
match and removes lots of fragment method calls. If you are using the C
target with only standard ASCII keywords, there is a built-in method to
use upper case matching in the LA function.
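
To illustrate the idea, here is a minimal, self-contained sketch in Java.
It does not use the ANTLR runtime: SimpleCharStream is a hypothetical
stand-in for an ANTLR 3 character stream, and UpperCaseLookaheadStream and
matchKeyword are names invented for this example. The point it tries to
show is that only the lookahead is upper-cased, while the underlying
buffer (and so the token text) keeps its original case.

```java
// Hypothetical stand-in for an ANTLR 3 character stream; only the parts
// needed to illustrate the LA() override are modelled here.
class SimpleCharStream {
    static final int EOF = -1;
    protected final char[] data; // original input, case preserved
    protected int p = 0;         // index of the next character to consume

    SimpleCharStream(String input) {
        data = input.toCharArray();
    }

    // Look ahead i characters (i >= 1) without consuming anything.
    // Simplification: i <= 0 is treated as EOF in this sketch.
    int LA(int i) {
        int idx = p + i - 1;
        if (i <= 0 || idx >= data.length) {
            return EOF;
        }
        return data[idx];
    }

    void consume() {
        if (p < data.length) {
            p++;
        }
    }

    // Text between two buffer indexes (inclusive), in its original case.
    String substring(int start, int stop) {
        return new String(data, start, stop - start + 1);
    }
}

// The override: lookahead is reported in upper case only, so keywords
// can be written once, in upper case.
class UpperCaseLookaheadStream extends SimpleCharStream {
    UpperCaseLookaheadStream(String input) {
        super(input);
    }

    @Override
    int LA(int i) {
        int c = super.LA(i);
        return c == EOF ? EOF : Character.toUpperCase(c);
    }
}

class KeywordDemo {
    // Hypothetical helper: match an upper-case keyword against the
    // lookahead and consume it on success.
    static boolean matchKeyword(SimpleCharStream in, String upperKeyword) {
        for (int k = 0; k < upperKeyword.length(); k++) {
            if (in.LA(k + 1) != upperKeyword.charAt(k)) {
                return false;
            }
        }
        for (int k = 0; k < upperKeyword.length(); k++) {
            in.consume();
        }
        return true;
    }

    public static void main(String[] args) {
        SimpleCharStream in = new UpperCaseLookaheadStream("bLoB");
        System.out.println(matchKeyword(in, "BLOB"));
        System.out.println(in.substring(0, 3));
    }
}
```

With a real ANTLR 3 Java lexer the same override would go in a subclass of
the stream class you pass to the lexer, and the grammar would then list
the keywords in upper case without any per-letter fragments.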

Jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Sébastien Kirche
> Sent: Thursday, July 14, 2011 9:03 AM
> To: antlr-interest
> Subject: [antlr-interest] proper way to handle case insensitive syntax
>
> Hi,
>
> thanks to the token listing tip, I have found that my problem with the
> lexer was not token priority but case sensitivity.
> Indeed, PBScript syntax is case insensitive.
>
> Following the ANTLR FAQ
> (http://www.antlr.org/wiki/pages/viewpage.action?pageId=1782), I have
> fixed my problem by rewriting my tokens like this:
>
> //Types
> Any : A N Y ;
> Blob : B L O B ;
> Boolean : B O O L E A N ;
> ...
> fragment A:('a'|'A');
> fragment B:('b'|'B');
> fragment C:('c'|'C');
> ...
> fragment Z:('z'|'Z');
>
> One thing I am wondering about is that the same FAQ solution warns
> about performance when a lot of fragments are used.
> Considering that in PBScript everything except string literals is case
> insensitive, is this still the right way to handle case insensitivity?
> In case the target language needs to be taken into account: I am
> prototyping the grammar with Java, and once I get something
> functional, I also plan to make a C parser for that language.
>
> Regards.
> --
> Sébastien Kirche
>

