[antlr-interest] solution to lexer issue
Gavin Lambert
antlr at mirality.co.nz
Thu Oct 25 03:40:13 PDT 2007
At 15:30 25/10/2007, Terence Parr wrote:
>Solution is to change my assumption that any char can follow a
>token (some of you don't believe me that is the problem but it
>is).
I'm curious, isn't this "any char can follow a token" thing only
true for (a) filter=true or (b) malformed input? Neither of which
ought to be the common case?
And I'm still not sure how assuming that any character at all
could follow the "e" in "one" means that you don't need to test
that the "e" is actually there at all. But I'll take your word
for it.
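To make the question concrete, here is a toy sketch in Python of the difference I have in mind. It is not ANTLR's generated code. The function names, the two-character lookahead, and the error message are all made up for illustration. The first matcher commits to a rule after looking at only a prefix, the second checks the whole literal before committing and otherwise falls back to OTHER.

```python
# Toy model only: KEYWORDS, the lookahead depth and all function names
# are hypothetical and do not come from ANTLR.
KEYWORDS = {"ONE": "one", "TWO": "two"}


def predict_by_prefix(text, pos, lookahead=2):
    """Pick a rule after inspecting only `lookahead` characters."""
    for name, literal in KEYWORDS.items():
        if text.startswith(literal[:lookahead], pos):
            return name
    return "OTHER"


def match_committed(text, pos):
    """Commit to the predicted rule; fail if the rest of the literal is absent."""
    name = predict_by_prefix(text, pos)
    if name == "OTHER":
        return name, text[pos]
    literal = KEYWORDS[name]
    if not text.startswith(literal, pos):
        raise ValueError(f"mismatched input while matching {name}")
    return name, literal


def match_verified(text, pos):
    """Check the whole literal before committing, else fall back to OTHER."""
    for name, literal in KEYWORDS.items():
        if text.startswith(literal, pos):
            return name, literal
    return "OTHER", text[pos]
```

On the input "on!", match_committed predicts ONE from the prefix "on" and then raises because the "e" is missing, while match_verified returns an OTHER token for "o" and carries on.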
>NUMBER: ('0'..'9')+ ('.' ('0'..'9')+)?;
>DOT : '.' ;
>
>NUMBER: ('0'..'9')+ ('.' ('0'..'9')+)?;
>OTHER: .;
>
>ONE: 'one';
>TWO: 'two';
>OTHER: .;
[...]
>Note that all three examples are ambiguous. Same input,
>different rules can match.
If all rules have equal precedence, then sure, they're
ambiguous. But I thought the lexer was supposed to have a defined
precedence (longest match and/or first listed token
wins)? Certainly the generated mTokens rule appears to test
them in order...
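The precedence I mean can be sketched in a few lines of Python. This is a minimal illustration of "longest match, then first listed rule wins", assuming a hand-written rule table; RULES and tokenize are hypothetical names, and none of this is how ANTLR builds its lexer.

```python
import re

# Hypothetical rule table in priority order (earlier rules win ties).
RULES = [
    ("ONE", re.compile(r"one")),
    ("TWO", re.compile(r"two")),
    ("NUMBER", re.compile(r"[0-9]+(?:\.[0-9]+)?")),
    ("DOT", re.compile(r"\.")),
    ("OTHER", re.compile(r".", re.DOTALL)),
]


def tokenize(text):
    """Split text into (rule, lexeme) pairs using longest match."""
    pos = 0
    tokens = []
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in RULES:
            m = pattern.match(text, pos)
            # Only a strictly longer match replaces the current best,
            # so on equal length the earlier rule is kept.
            if m and len(m.group()) > best_len:
                best_name, best_len = name, len(m.group())
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens
```

With this scheme "1.5" is a single NUMBER, "1." is a NUMBER followed by a DOT (DOT and OTHER both match one character, and DOT is listed first), and "on" is two OTHER tokens because ONE never matches.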
>I believe that solution will satisfy everyone. Added
>improvement request:
>
>http://www.antlr.org:8888/browse/ANTLR-189
Cool! :)