[antlr-interest] V3 lexer behaviour clarifications
Gavin Lambert
antlr at mirality.co.nz
Sat Mar 31 15:09:28 PDT 2007
Just trying to get my head around some of the differences between
lexer and parser (in V3). Am I correct in assuming that the lexer
doesn't get any of the cool new LL(*) lookahead and backtracking
that's available to the parser?
Logically, if I've got two lexer rules like so:
FLOAT : INT '.' INT;
INT : ('0'..'9')+;
There's obviously an ambiguity between them (both start with a run
of digits), but I would expect the lexer to try matching FLOAT
first (since I listed it first), and only if that fails return an
INT and then lex whatever comes after it as a separate token.
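
To make that expectation concrete, here is a rough sketch in plain Python (not ANTLR, and not a description of how the V3 lexer actually works). It tries the rules in listed order at each position and falls back to the next rule when one fails. The names RULES, tokenize and the catch-all OTHER token are made up for illustration only.

```python
import re

# Hypothetical stand-ins for the two lexer rules, kept in the order listed.
RULES = [
    ("FLOAT", re.compile(r"[0-9]+\.[0-9]+")),
    ("INT", re.compile(r"[0-9]+")),
]

def tokenize(text):
    """Try each rule in listed order at the current position; first match wins."""
    pos, tokens = 0, []
    while pos < len(text):
        for name, pattern in RULES:
            m = pattern.match(text, pos)
            if m:
                tokens.append((name, m.group()))
                pos = m.end()
                break
        else:
            # No rule matched here: emit the character as its own token.
            tokens.append(("OTHER", text[pos]))
            pos += 1
    return tokens
```

With this ordering, an input such as "12.x" fails the FLOAT attempt without an error and is simply lexed as an INT followed by separate tokens for the rest, which is the behaviour I was hoping for.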
However, when I try a similar grammar (not the exact one above),
that's not what seems to happen. The lexer just reports an error
and then treats the input as an INT. The only way I can get the
behaviour I want is to write a composite rule with predicates and
explicit token-type-changing code, which seems ugly.
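
For reference, the logic of that composite-rule workaround looks roughly like the following when written out as plain Python rather than as a grammar. This is only a sketch of the idea: the function name lex_number and its return shape are invented here, and the real grammar expresses the lookahead check as a predicate.

```python
DIGITS = "0123456789"

def lex_number(text, pos=0):
    """Match digits, then peek for '.' followed by a digit before committing to FLOAT."""
    start = pos
    while pos < len(text) and text[pos] in DIGITS:
        pos += 1
    if pos == start:
        return None  # no digits at this position, so not a number
    token_type = "INT"
    # The "predicate": only enter the fractional part if '.' and a digit follow.
    if pos + 1 < len(text) and text[pos] == "." and text[pos + 1] in DIGITS:
        pos += 1  # consume the '.'
        while pos < len(text) and text[pos] in DIGITS:
            pos += 1
        token_type = "FLOAT"  # the explicit token-type change
    return token_type, text[start:pos], pos
```

The lookahead check is what avoids the error: the rule never starts consuming the '.' unless it already knows a digit follows it.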
Is this normal for now? If so, will it be improved in the
future? Or am I just doing something stupid?