[antlr-interest] Resolving ambiguities in Lexer rules

David-Sarah Hopwood david-sarah at jacaranda.org
Sun Aug 16 07:45:54 PDT 2009


Achint Mehta wrote:
> 2. The second option is that all the tokens have to be given as alternate
> rules/tokens with SPECIAL_STRING. Again, in a big/complicated parser, all the
> tokens in the whole parser have to be repeated wherever I intend to use
> the SPECIAL_STRING. This can be simplified if I give the tokens in the
> definition of SPECIAL_STRING itself. But still, in a parser which could use
> tens or hundreds of tokens, it would seem to be impractical to repeat all
> the tokens in the SPECIAL_STRING rule and other similar rules (intended for
> collecting the generic string).

You only need one rule that collects all of the tokens, which can then be
included as an alternative in SPECIAL_STRING and similar rules. This is
the approach I use for ECMAScript 5 (which has some contexts in which
keywords are treated like identifiers). The duplication is mildly
irritating, but it still works quite well in practice.
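
A minimal sketch of that arrangement, assuming ANTLR 3 syntax with the
Java target. The grammar name, the three keyword tokens, and the rule names
keyword, specialString, propertyName and memberAccess are hypothetical and
chosen only for illustration; they are not taken from the ECMAScript 5
grammar mentioned above.

```antlr
grammar KeywordAsName;   // hypothetical grammar name

// Parser rules

// The single rule that collects every keyword token.
// A new keyword is added here once, not in every generic rule.
keyword
    : IF
    | WHILE
    | RETURN
    ;

// Generic "string" rule: an ordinary identifier or any keyword.
specialString
    : IDENT
    | keyword
    ;

// Another rule of the same kind reuses the same collector.
propertyName
    : IDENT
    | keyword
    ;

// Example context in which a keyword is accepted as a plain name.
memberAccess
    : IDENT '.' propertyName
    ;

// Lexer rules (keywords are listed before IDENT)

IF     : 'if' ;
WHILE  : 'while' ;
RETURN : 'return' ;

IDENT  : ('a'..'z' | 'A'..'Z' | '_') ('a'..'z' | 'A'..'Z' | '0'..'9' | '_')* ;

WS     : (' ' | '\t' | '\r' | '\n')+ { $channel = HIDDEN; } ;
```

With this layout the lexer still emits IF for the text "if", and the parser
decides from context whether that token acts as a keyword or as a name. The
remaining duplication is the list of alternatives in keyword, which has to
be kept in step with the keyword token definitions.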

-- 
David-Sarah Hopwood  ⚥  http://davidsarah.livejournal.com


