[antlr-interest] Beginner lexing question.

Peter C. Chapin pcc482719 at gmail.com
Mon Aug 4 07:07:24 PDT 2008


Gavin Lambert wrote:

> Now your lexer is ambiguous between T42 and UNARY_OPERATOR -- so on 
> seeing a '*' as input, ANTLR will generate one or the other (depending 
> on the order it sees the rules in) and the other will never happen, 
> which will break your parser rules.
>
> Ideally, when starting out with ANTLR you should avoid composite 
> grammars (or at least avoid using quoted literals in parser rules), 
> since they lead to this kind of trap all too easily.
>
> Probably the best thing to do to resolve this specific problem is to 
> make separate lexer rules for each operator symbol and then change 
> UNARY_OPERATOR into a parser rule.  Another useful rule of thumb is 
> that where ambiguity exists, try to avoid assigning semantic meaning 
> in the lexer.  (Sometimes it can't be avoided due to 
> whitespace-handling issues, but that makes things complicated.)
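
For reference, here is a minimal sketch of what that suggestion might look
like, assuming ANTLR 3 syntax. The grammar name, the operator set, and every
rule other than the unary-operator rule are hypothetical, since the original
grammar is not shown in this thread.

```antlr
grammar OperatorSketch;   // hypothetical name

// Parser rules (lowercase). The decision about whether '*' is unary
// or binary is made here, from context, not in the lexer.
expression
    : unary_operator? primary (STAR primary)*
    ;

// Formerly the lexer rule UNARY_OPERATOR, now a parser rule built
// from plain tokens. The operator set is an assumption.
unary_operator
    : PLUS
    | MINUS
    | STAR
    ;

primary
    : ID
    ;

// Lexer rules (uppercase): exactly one rule per symbol, so no two
// rules can match the same input, and no semantic meaning is attached.
PLUS  : '+' ;
MINUS : '-' ;
STAR  : '*' ;

ID    : ('a'..'z' | 'A'..'Z')+ ;
WS    : (' ' | '\t' | '\r' | '\n')+ { $channel = HIDDEN; } ;
```

Because the parser rules refer to STAR by name rather than to the quoted
literal '*', ANTLR has no reason to create an implicit token such as T42
for it, so there is nothing left to collide with in the lexer.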

Thanks for the insight. I understand better what is happening now.  I'll 
take steps to be sure this issue doesn't bite me in the future. Well, I 
hope anyway. :-)

Peter


