[antlr-interest] Look-ahead problem parsing phrase?
Gavin Lambert
antlr at mirality.co.nz
Sun Jun 28 14:41:09 PDT 2009
At 09:21 29/06/2009, Sean O'Dell wrote:
>Why should lexer rules not refer to other lexer rules without
>being fragments? I've read that doing so only prevented token
>creation. It affects logic, as well?
The moment you have one top-level lexer rule referring to another
top-level rule, you introduce ambiguity -- you're basically
telling the lexer "given this input, produce one of these two
tokens but I don't care which", and then in the parser you're
expecting exactly one of those tokens. Sometimes the lexer will
happen to pick the right one and the input will parse. Sometimes
it won't. And sometimes the rules are sufficiently different that
certain input produces one token and other input produces the
other. Then you're basically screwed.
It's important that given any particular input in isolation, there
should be one and only one possible token that can be produced for
it. Doing anything else is just letting yourself in for a world
of pain.
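
To make that concrete, here is a minimal sketch in ANTLR 3 syntax. The rule names (DIGIT, LETTER, INT, ID) are made up for illustration and are not taken from your grammar. In the first version DIGIT is a top-level rule, so the input "5" is a valid DIGIT and also a valid INT; depending on rule order and lookahead, a lone digit may well come out as DIGIT while a longer run comes out as INT:

```
// Problematic (hypothetical): DIGIT is a token in its own right,
// so a single digit matches both DIGIT and INT.
DIGIT : '0'..'9' ;
INT   : DIGIT+ ;
```

Marking the helper rules as fragments means they never produce tokens themselves, so each piece of input has only one possible token:

```
// Safer (hypothetical): helpers are fragments and can only be
// used from other lexer rules.
fragment DIGIT  : '0'..'9' ;
fragment LETTER : 'a'..'z' | 'A'..'Z' ;

INT : DIGIT+ ;
ID  : LETTER (LETTER | DIGIT)* ;
```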
Also, your EOL rule was a top-level lexer rule that could
successfully match zero characters. That creates infinite loops,
because the lexer can keep emitting that token without ever
consuming any input, so it is something else that must be avoided.
(Which is another reason why it should be a parser rule.)
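
As a rough sketch of what I mean (again in ANTLR 3 syntax, with made-up rule bodies, since I'm only guessing at the shape of your EOL rule), a lexer rule like the first one below can succeed without consuming anything:

```
// Problematic (hypothetical): the * allows an empty match, so the
// lexer can emit EOL tokens forever at the same input position.
EOL : ('\r'? '\n')* ;
```

One way around it is to have the lexer rule always consume at least one character and move the optional or repeated part into the parser:

```
// Lexer: NEWLINE always consumes at least one character.
NEWLINE : '\r'? '\n' ;

// Parser: "zero or more line ends" is expressed here instead.
eol : NEWLINE* ;
```

NEWLINE and eol are just illustrative names; the point is only that every top-level lexer rule must consume input whenever it matches.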