[antlr-interest] Beginner lexing question.
Peter C. Chapin
pcc482719 at gmail.com
Mon Aug 4 07:07:24 PDT 2008
Gavin Lambert wrote:
> Now your lexer is ambiguous between T42 and UNARY_OPERATOR -- so on
> seeing a '*' as input, ANTLR will generate one or the other (depending
> on the order it sees the rules in) and the other will never happen,
> which will break your parser rules.
>
> Ideally, when starting out with ANTLR you should avoid composite
> grammars (or at least avoid using quoted literals in parser rules),
> since they lead to this kind of trap all too easily.
>
> Probably the best thing to do to resolve this specific problem is to
> make separate lexer rules for each operator symbol and then change
> UNARY_OPERATOR into a parser rule. Another useful rule of thumb is
> that where ambiguity exists, try to avoid assigning semantic meaning
> in the lexer. (Sometimes it can't be avoided due to
> whitespace-handling issues, but that makes things complicated.)
Thanks for the insight. I understand better what is happening now. I'll
take steps to be sure this issue doesn't bite me in the future. Well, I
hope anyway. :-)
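
A minimal sketch of the restructuring suggested above, in ANTLR 3 syntax.
The grammar name, the rule names, and the particular set of operators are
assumptions made for illustration, since the original grammar is not shown
in this message. Each operator symbol gets exactly one lexer rule, no
quoted literals appear in parser rules (so no implicit T42-style tokens
are generated), and the unary/binary distinction is made by the parser
from context.

```antlr
grammar OperatorSketch;   // hypothetical name, not the original grammar

// Parser rules: whether '*' or '-' is unary or binary is decided here,
// from its position in the expression, not in the lexer.
expr          : term ( ( PLUS | MINUS ) term )* ;
term          : unary ( ( STAR | SLASH ) unary )* ;
unary         : unaryOperator unary
              | atom
              ;

// Formerly a lexer rule (UNARY_OPERATOR); as a parser rule it can reuse
// the same tokens that the binary-operator alternatives use.
// The operator set is an assumption; the source only confirms '*'.
unaryOperator : PLUS | MINUS | STAR | AMP ;

atom          : ID
              | INT
              | LPAREN expr RPAREN
              ;

// Lexer rules: one rule per symbol, so each input character sequence
// matches exactly one token type.
PLUS   : '+' ;
MINUS  : '-' ;
STAR   : '*' ;
SLASH  : '/' ;
AMP    : '&' ;
LPAREN : '(' ;
RPAREN : ')' ;

ID     : ( 'a'..'z' | 'A'..'Z' | '_' )
         ( 'a'..'z' | 'A'..'Z' | '0'..'9' | '_' )* ;
INT    : ( '0'..'9' )+ ;

// Whitespace goes to the hidden channel (action shown for the Java target).
WS     : ( ' ' | '\t' | '\r' | '\n' )+ { $channel = HIDDEN; } ;
```

With this layout the lexer only ever emits STAR for '*', and the parser
reaches unaryOperator or the binary alternative in term depending on
where the token appears.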