[antlr-interest] Lexing problem I cannot resolve

Gavin Lambert antlr at mirality.co.nz
Sun Aug 3 03:50:24 PDT 2008


At 22:16 3/08/2008, Carter Cheng wrote:
 >1..2
 >
 >Which the lexer seems to like to lex as two FLOATS as opposed to as
 >INT RANGE INT. In the language in question FLOAT FLOAT is illegal
 >but obviously the lexer cannot know that. Is there a way to resolve
 >this in ANTLR cleanly?

Presumably it's splitting it up into FLOAT["1."] FLOAT[".2"]?
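
To see why that split would happen, here is a small Python model of a lexer that takes the longest match at each position and never looks past it. The token rules are hypothetical (the original grammar wasn't posted, so the FLOAT pattern is a guess), and this is not how ANTLR's lexer works internally; it only reproduces the suspected outcome.

```python
import re

# Hypothetical token rules. The original poster's grammar is not shown,
# so this FLOAT pattern (digits '.' optional digits, or '.' digits) is
# an assumption made purely for illustration.
NAIVE_RULES = [
    ("FLOAT", re.compile(r"\d+\.\d*|\.\d+")),
    ("RANGE", re.compile(r"\.\.")),
    ("INT", re.compile(r"\d+")),
]


def naive_tokenize(text):
    """Take the longest match at each position, with no extra lookahead."""
    pos, tokens = 0, []
    while pos < len(text):
        best = None
        for name, pattern in NAIVE_RULES:
            m = pattern.match(text, pos)
            # Keep the longest match; earlier rules win ties.
            if m and (best is None or m.end() > best[1].end()):
                best = (name, m)
        if best is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        tokens.append((best[0], best[1].group()))
        pos = best[1].end()
    return tokens
```

With these rules, naive_tokenize("1..2") yields FLOAT "1." followed by FLOAT ".2": the first period is swallowed by the first FLOAT before RANGE ever gets a chance.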

For starters, you could declare the former one to be an illegal 
FLOAT -- after all it's a bit odd to have a trailing period with 
no following digits.

But whether you choose to make that illegal or not (and you don't 
*have* to), you'll need to modify the FLOAT rule to look ahead, 
see two periods, and exit without matching either.

Something along these lines ought to do the trick:

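fragment DIGIT: '0'..'9';
RANGE: '..';
INT
   : DIGIT+
     ( ('..') => /* RANGE follows; match nothing and stay INT */
     | '.' DIGIT* { $type = FLOAT; }   // "1." or "1.5"
     )?
   | ('.' DIGIT) => '.' DIGIT+ { $type = FLOAT; }   // ".5"
   ;
// Declares the FLOAT token type so the actions above can reference it.
fragment FLOAT: ;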

(If you want to make "1." illegal, then changing DIGIT* to DIGIT+ 
in the '.' DIGIT* alternative, on the sixth line, ought to do the 
trick.)
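
For anyone who finds the predicates hard to read, the same decision logic can be sketched as a hand-written scanner in Python. This is only an illustration of the lookahead idea, not what ANTLR generates; the function names (is_digit, lex_number, tokenize) are made up for the example, and whitespace skipping is an added assumption.

```python
def is_digit(ch):
    """ASCII digits only, matching DIGIT: '0'..'9'."""
    return "0" <= ch <= "9"


def skip_digits(text, pos):
    """Return the position just past any run of digits starting at pos."""
    while pos < len(text) and is_digit(text[pos]):
        pos += 1
    return pos


def lex_number(text, pos):
    """Mirror the INT rule: return (type, lexeme, new_pos) or None."""
    start = pos
    if pos < len(text) and is_digit(text[pos]):
        pos = skip_digits(text, pos)
        if text.startswith("..", pos):
            # Two periods ahead: consume neither and leave them for RANGE.
            return ("INT", text[start:pos], pos)
        if text.startswith(".", pos):
            # A single period: this is a FLOAT such as "1." or "1.5".
            pos = skip_digits(text, pos + 1)
            return ("FLOAT", text[start:pos], pos)
        return ("INT", text[start:pos], pos)
    if text.startswith(".", pos) and pos + 1 < len(text) and is_digit(text[pos + 1]):
        # Leading period followed by a digit: a FLOAT such as ".5".
        pos = skip_digits(text, pos + 1)
        return ("FLOAT", text[start:pos], pos)
    return None


def tokenize(text):
    """Split text into INT, FLOAT and RANGE tokens, skipping whitespace."""
    pos, tokens = 0, []
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        number = lex_number(text, pos)
        if number is not None:
            kind, lexeme, pos = number
            tokens.append((kind, lexeme))
        elif text.startswith("..", pos):
            tokens.append(("RANGE", ".."))
            pos += 2
        else:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
    return tokens
```

Here tokenize("1..2") gives INT "1", RANGE "..", INT "2", while "1.5", "1." and ".5" still come out as single FLOATs. The check for ".." has to come before the check for a single ".", which is the same ordering the predicate enforces in the grammar.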

You *might* need to merge the RANGE rule into the INT rule as 
well, but I think the above will work ok as is.


