[antlr-interest] Understanding priorities in lexing (newbie)
antlr at mirality.co.nz
Thu Jul 12 13:11:42 PDT 2007
At 07:46 13/07/2007, Terence Parr wrote:
>Hi Tom. Actually even if I did, OTHER OTHER matches 'ab' as
>does KEYWORD and so it has to resolve the ambiguity, which it
>does in favor of the first rule specified.
The point is that 'ab' *doesn't* match KEYWORD -- except in the
mind of the predictor, since it isn't checking the whole rule. So
an input of 'ab' should unambiguously result in OTHER OTHER; an
input of 'abc' *could* result in either OTHER OTHER OTHER or
KEYWORD, but the normal "pick the longest match and/or the first
listed" rules sort out that ambiguity.
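
To make the expected behaviour concrete, here is a rough Python sketch of a "longest match, then first listed" tokenizer. It is not ANTLR code, and the rule set is an assumption, since the grammar isn't restated in this message: KEYWORD is taken to be the literal 'abc' and OTHER any single character. The names RULES and tokenize are made up for the illustration.

```python
# Toy "longest match, then first listed" lexer.
# Assumed rules (not the original grammar): KEYWORD : 'abc' ;  OTHER : . ;
RULES = [
    # Each matcher returns the number of characters it can consume at offset i.
    ("KEYWORD", lambda s, i: 3 if s.startswith("abc", i) else 0),
    ("OTHER", lambda s, i: 1 if i < len(s) else 0),
]


def tokenize(text):
    """Return the list of token names for text."""
    tokens, i = [], 0
    while i < len(text):
        # Try every rule in full and keep the longest match.
        # max() returns the first maximal item, so ties go to the earlier rule.
        name, length = max(
            ((name, match(text, i)) for name, match in RULES),
            key=lambda pair: pair[1],
        )
        tokens.append(name)
        i += length
    return tokens


print(tokenize("ab"))
print(tokenize("abc"))
print(tokenize("aba"))
```

Under those assumptions 'ab' and 'aba' come out as nothing but OTHER tokens, and 'abc' comes out as a single KEYWORD, because every rule is checked in full before one is chosen.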
In the current implementation, though, the predictor sees 'ab' and
immediately declares "That must be a KEYWORD!" -- even when the
input is actually 'aba', whose only "correct" output would be
OTHER OTHER OTHER. So this results in an exception rather than
producing the right output.
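
The failure mode can be modelled with another rough Python sketch. This is a toy, not ANTLR's generated code: it assumes the same hypothetical rules (KEYWORD is the literal 'abc', OTHER is any single character) and a predictor that commits as soon as it sees 'ab'. The names predict, tokenize_with_early_commit and LexError are invented for the illustration.

```python
class LexError(Exception):
    """Raised when the committed rule fails to match (hypothetical name)."""


def predict(text, i):
    # Decides on a prefix only: 'ab' is taken as proof of KEYWORD,
    # without checking that the rest of the rule ('c') is really there.
    return "KEYWORD" if text.startswith("ab", i) else "OTHER"


def tokenize_with_early_commit(text):
    """Tokenize text, committing to whatever rule predict() picks."""
    tokens, i = [], 0
    while i < len(text):
        rule = predict(text, i)
        if rule == "KEYWORD":
            # No fallback to OTHER once the prediction has been made.
            if not text.startswith("abc", i):
                raise LexError(f"expected 'abc' at offset {i}")
            i += 3
        else:
            i += 1
        tokens.append(rule)
    return tokens


for sample in ("abc", "ab", "aba"):
    try:
        print(sample, tokenize_with_early_commit(sample))
    except LexError as err:
        print(sample, "error:", err)
```

In this model 'abc' still lexes as KEYWORD, but 'ab' and 'aba' raise an error instead of falling back to OTHER tokens, which is the behaviour complained about above.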
>It uses PROGRAM rule w/o the + because what if you had an error
I'm not sure what you meant by this.
>There is an implied loop to PROGRAM in nextToken() method.
But the predictor doesn't know anything about it -- hence the
This whole thing makes it really hard to write correct lexers --
especially since ANTLR also seems to ignore predicates if it
thinks it knows better. If this one thing was fixed then it'd
make ANTLR significantly easier to use. And I've been saying that
for ages now :)