[antlr-interest] newbie faces unexpected tokens

Jörg Rathlev joerg at jrsoftware.de
Wed Jul 12 03:29:59 PDT 2006


Hi Mo,

> 	NUMBER options {testLiterals=true;}
> 	  : ('0'..'9')+ ('.' ('0'..'9')*)? | '.' ('0'..'9')+;

I'm just guessing, but maybe the lexer tries to read ".bar" as a NUMBER
token rather than as a DOT followed by an IDENT. The lookahead sets for
NUMBER and DOT should be different, but maybe they are not, due to
ANTLR's linear approximate lookahead. Do you get any nondeterminism
warnings?

You can probably use syntactic predicates to solve this problem. Or you
could do what the JavaLexer in java.g does: remove the DOT rule and
rewrite the second alternative of your NUMBER rule to something like the
following, so that a DOT token is recognized inside the NUMBER rule and
the token type is only switched to NUMBER once digits follow the dot:

  '.' {_ttype = DOT;} ( ('0'..'9')+ {_ttype = NUMBER;} )?
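
For the syntactic predicate variant, here is an untested sketch of what I
have in mind. It assumes ANTLR 2 syntax, that the separate DOT rule is
removed from the lexer, and that DOT is still declared as a token type
somewhere (for example in a tokens { DOT; } section), which I can't see
from your mail:

  NUMBER options {testLiterals=true;}
    : ('0'..'9')+ ('.' ('0'..'9')*)?
      // predicate: only take this alternative if a digit follows the dot
    | ('.' '0'..'9') => '.' ('0'..'9')+
      // otherwise it is a lone dot, so report it as DOT (assumed token type)
    | '.' {_ttype = DOT;}
    ;

With that, ".5" should come out as a NUMBER, and ".bar" as a DOT followed
by whatever your IDENT rule makes of "bar".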


Cheers
Joerg


