[antlr-interest] Why does the unused rule affect parser behaviour?

Gavin Lambert antlr at mirality.co.nz
Tue Jan 10 03:41:20 PST 2012


At 23:59 10/01/2012, Seref Arikan wrote:
 >Ok, for anyone else who encounters the same thing:
 >When I use characters directly in parser rules such as 'a', they
 >end up as tokens. Even though 'a' is a character that is normally
 >covered by lower case token, it exists on its own, and parser
 >matches it, providing an unexpected token type for the rule that
 >is trying to use lower case token.
 >Lesson learned: do not use characters in parser rules, use
 >tokens..

Yep.  And never use token ranges either, since a range matches on 
the numeric values of the token types, which aren't guaranteed to 
be in any particular order or to stay in that order.
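
To illustrate the token-range point, here is a minimal sketch. The 
grammar name RangeDemo and the token names ZERO, ONE and TWO are 
made up for the example, and the range form is shown only as a 
comment, since whether a given ANTLR version accepts it at all is 
not assumed here. The idea is simply to list the accepted tokens 
explicitly rather than rely on how their types happen to be 
numbered.

```antlr
grammar RangeDemo;  // hypothetical grammar name

// Avoid in a parser rule: a range over tokens, e.g.  digit : ZERO..TWO ;
// It would compare token type numbers, which ANTLR assigns itself.

// List the accepted tokens explicitly instead.
digit : ZERO | ONE | TWO ;

// Hypothetical token names, one per character.
ZERO : '0' ;
ONE  : '1' ;
TWO  : '2' ;
```

The explicit alternatives keep working if token definitions are 
later added, removed or reordered.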


I actually consider literal tokens in parser rules to be a 
misfeature -- they're a little tidier for toy grammars, but for 
anything serious (e.g. anything that wants decent error messages) 
they're more harmful than helpful, and it's far too easy to get 
things wrong, as you discovered.
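
As a rough sketch of the difference (the grammar name LiteralDemo, 
the rule names item and letter, and the token names A and LOWER are 
all invented for illustration; this is not the grammar from the 
original question): the commented-out rule uses a quoted literal in 
a parser rule, while the rules below it use only named tokens, so 
every token type the parser can see is spelled out in the grammar.

```antlr
grammar LiteralDemo;  // hypothetical grammar name

// Problematic form: the quoted 'a' becomes a token of its own, so the
// input "a" no longer arrives as LOWER in rules that expect LOWER.
// item : 'a' LOWER* ;

// Named-token form: an 'a' followed by any number of lower case letters.
item   : A letter* ;
letter : A | LOWER ;

// 'a' gets its own named token; LOWER covers the remaining letters,
// so the two lexer rules never compete for the same character.
A      : 'a' ;
LOWER  : 'b'..'z' ;
```

With named tokens, the generated code and error messages can also 
refer to A and LOWER by name rather than to an auto-generated token.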


