[antlr-interest] proper way to handle case insensitive syntax

Sébastien Kirche sebastien.kirche at gmail.com
Thu Jul 14 09:03:22 PDT 2011


Hi,

Thanks to the token listing tip, I found that my problem with the
lexer was not token priority but case sensitivity: PBScript syntax
is case insensitive.

Following the ANTLR FAQ
(http://www.antlr.org/wiki/pages/viewpage.action?pageId=1782), I
fixed my problem by rewriting my tokens like this:

//Types
Any : A N Y ;
Blob : B L O B ;
Boolean : B O O L E A N ;
...
fragment A:('a'|'A');
fragment B:('b'|'B');
fragment C:('c'|'C');
...
fragment Z:('z'|'Z');

One thing I am wondering about: the same FAQ entry warns about
performance when a lot of fragments are used.
Considering that in PBScript everything except string literals is case
insensitive, is this still the right way to handle case insensitivity?
In case the target language matters: I am prototyping the grammar
with Java, and once I have something functional, I also plan to
build a C parser for that language.
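
For comparison, the alternative I understand the FAQ to suggest is to
fold case in the input stream's lookahead (the LA() method) instead of
in the grammar, so that token rules can be written in a single case
while the token text keeps its original spelling. Below is a minimal
standalone sketch of that idea only. It does not use the ANTLR runtime,
and the class CaseFoldingLookahead and its matchKeyword helper are
hypothetical names made up for illustration.

```java
// Hypothetical helper, NOT part of the ANTLR runtime: it only illustrates
// folding case at lookahead time while keeping the original text intact.
final class CaseFoldingLookahead {
    static final int EOF = -1;

    private final String text; // original input, case preserved
    private int pos = 0;       // index of the next unread character

    CaseFoldingLookahead(String text) {
        this.text = text;
    }

    /** Returns the i-th upcoming character (1-based), folded to upper case. */
    int LA(int i) {
        int idx = pos + i - 1;
        if (i < 1 || idx >= text.length()) {
            return EOF;
        }
        return Character.toUpperCase(text.charAt(idx));
    }

    /** Advances past one character, if any is left. */
    void consume() {
        if (pos < text.length()) {
            pos++;
        }
    }

    /**
     * Tries to match a keyword given in upper case at the current position.
     * On success, consumes it and returns the text as originally written;
     * on failure, consumes nothing and returns null.
     */
    String matchKeyword(String upperKeyword) {
        for (int i = 0; i < upperKeyword.length(); i++) {
            if (LA(i + 1) != upperKeyword.charAt(i)) {
                return null;
            }
        }
        int start = pos;
        for (int i = 0; i < upperKeyword.length(); i++) {
            consume();
        }
        return text.substring(start, pos);
    }
}
```

With such a stream, a rule like Boolean could presumably be written as
a plain 'BOOLEAN' literal, and the per-letter fragments would no longer
be needed. String literals would still keep their original case in the
token text, since only the lookahead is folded.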

Regards.
-- 
Sébastien Kirche

