[antlr-interest] How can this identifier from a LR grammar be expressed in ANTLR

Seref Arikan serefarikan at kurumsalteknoloji.com
Thu Jan 12 15:05:04 PST 2012


Greetings,

The following line is from a grammar developed with an LR parser generator
tool:

Identifier = {LetterMinusA}{IdCharMinusT}?{IdChar}* |
'a''t'?(({letter}|'_')*|{LetterMinusT}{Alphanumeric}*)

The elements of the rule are fairly self-explanatory. The grammar is valid
and is used in production. This rule is meant to represent identifiers in a
query language. LetterMinusA is any Latin letter except 'a', and IdChar is
simply Alphanumeric | '_' | '.'.
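
To make my reading of the rule concrete, here is a rough Python sketch that
transcribes it into a regular expression. Only LetterMinusA and IdChar are
defined above; the other classes (IdCharMinusT, LetterMinusT, letter,
Alphanumeric), the ASCII-only ranges and the case handling are my
assumptions, and is_identifier is just a helper name for illustration.

```python
import re

# Assumed character classes; the post only spells out LetterMinusA and IdChar.
LETTER_MINUS_A = "[b-zA-Z]"            # Latin letters except lowercase 'a' (case handling assumed)
ID_CHAR = "[A-Za-z0-9_.]"              # Alphanumeric | '_' | '.'
ID_CHAR_MINUS_T = "[A-Za-su-z0-9_.]"   # IdChar except lowercase 't' (assumed)
LETTER_MINUS_T = "[A-Za-su-z]"         # letters except lowercase 't' (assumed)
LETTER_OR_UNDERSCORE = "[A-Za-z_]"     # {letter} | '_'
ALPHANUMERIC = "[A-Za-z0-9]"

# Direct transcription of the two top-level alternatives of the rule.
IDENTIFIER = re.compile(
    rf"{LETTER_MINUS_A}{ID_CHAR_MINUS_T}?{ID_CHAR}*"
    rf"|at?(?:{LETTER_OR_UNDERSCORE}*|{LETTER_MINUS_T}{ALPHANUMERIC}*)"
)

def is_identifier(text):
    """Return True if the whole string matches the transcribed rule."""
    return IDENTIFIER.fullmatch(text) is not None
```

Under these assumptions, is_identifier accepts "a", "at", "attribute",
"ab1" and "b.x", and rejects "a1", "at1" and "_x".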

I've been trying to get my head around it, but the second part in
particular, the one that starts with 'a''t'?, looks horribly ambiguous to
me. How can ANTLR know where the 't' in an input such as 'at' belongs?
Likewise, the {letter} | '_' and {LetterMinusT} alternatives would collide.
I won't even ask how the tool that this grammar was written for handles it,
but if you have any clues about porting this to ANTLR, I'd be more than
happy to hear them.
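
To show the overlap I mean, here is a small Python sketch that lists every
way an input can be split across the second part of the rule: 'a', then the
optional 't' either consumed or skipped, then one of the two branches. The
character ranges for letter, LetterMinusT and Alphanumeric are my
assumptions (ASCII, lowercase 't' excluded), and derivations is only a
helper name for illustration.

```python
import re

# Assumed character classes (not given in full above).
LETTER_UNDERSCORE_RUN = re.compile(r"[A-Za-z_]*")             # ({letter}|'_')*
MINUS_T_THEN_ALNUM = re.compile(r"[A-Za-su-z][A-Za-z0-9]*")   # {LetterMinusT}{Alphanumeric}*

def derivations(text):
    """List every (t handling, branch) pair that matches the whole input."""
    found = []
    if not text.startswith("a"):
        return found
    # The optional 't' is either skipped or, if present, consumed.
    options = [("t skipped", text[1:])]
    if text[1:2] == "t":
        options.append(("t consumed", text[2:]))
    for label, rest in options:
        if LETTER_UNDERSCORE_RUN.fullmatch(rest):
            found.append((label, "letter/underscore run"))
        if MINUS_T_THEN_ALNUM.fullmatch(rest):
            found.append((label, "LetterMinusT then alphanumerics"))
    return found
```

With these assumptions, derivations("at") returns two entries (the 't' can
go to the optional 't' or to the letter run) and derivations("ab") also
returns two (both branches match "b"), while derivations("ab1") returns
exactly one and derivations("at1") returns none.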

Regards
Seref

