[antlr-interest] [Antlr 3] lexer problem

Gavin Lambert antlr at mirality.co.nz
Wed May 16 05:11:59 PDT 2007


At 20:57 15/05/2007, Sven Efftinge wrote:
 >Yes, I know. But I'd expect that the lexer tracks back when it
 >can not complete the optional ('.' DIGIT)? part.
 >So it just consumes 42 (because it is a valid FLOAT, too).
 >The parser behaves like this, but the lexer not.
 >I'd expected that the following two grammars would successfully
 >parse '42.foo'

Yeah, I have to say I agree with that one.  Slight variation:

lexer grammar Test;
DOT : '.';
TEXT :	('a'..'z')+;
FLOAT : DIGIT (DOT DIGIT)?;
fragment DIGIT : ('0'..'9')+;

I expect the output of the lexer to be FLOAT(42) DOT(.) TEXT(foo).

But it's not.  Instead it outputs an error:
   1:3 required (...)+ loop did not match anything at character 'f'
and then outputs TEXT(oo).  This makes no sense to me, since the 
loop is *not* required: it's being called from an optional 
block.  Any failure to match within that block should roll back 
automatically, whether or not backtracking is enabled, and the 
lexer should then proceed exactly as it would if the optional 
block were missing from the grammar.  (Currently it doesn't seem 
to do any sort of rollback at all, even with backtracking on.)
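
To make the behaviour I'm expecting concrete, here is a minimal 
hand-written sketch in Python.  It is not ANTLR-generated code and 
not how ANTLR works internally; the names tokenize and match_run 
are made up purely for illustration.  It mirrors the three token 
rules above and treats (DOT DIGIT)? as a speculative match that is 
abandoned if no digit follows the dot.

```python
import string

DIGITS = set(string.digits)
LETTERS = set(string.ascii_lowercase)


def match_run(text, pos, charset):
    """Return the index just past the longest run of charset characters at pos."""
    end = pos
    while end < len(text) and text[end] in charset:
        end += 1
    return end


def tokenize(text):
    """Hypothetical lexer for DOT, TEXT and FLOAT : DIGIT (DOT DIGIT)?"""
    tokens = []
    pos = 0
    while pos < len(text):
        ch = text[pos]
        if ch in DIGITS:
            end = match_run(text, pos, DIGITS)  # mandatory DIGIT part
            # Optional (DOT DIGIT)? block: try it speculatively.
            if end < len(text) and text[end] == ".":
                frac_end = match_run(text, end + 1, DIGITS)
                if frac_end > end + 1:
                    end = frac_end  # a digit followed the dot: commit
                # Otherwise roll back: 'end' still points at the dot,
                # so the dot is left for the DOT rule to consume.
            tokens.append(("FLOAT", text[pos:end]))
            pos = end
        elif ch == ".":
            tokens.append(("DOT", ch))
            pos += 1
        elif ch in LETTERS:
            end = match_run(text, pos, LETTERS)
            tokens.append(("TEXT", text[pos:end]))
            pos = end
        else:
            raise ValueError(f"unexpected character {ch!r} at index {pos}")
    return tokens
```

With this sketch, tokenize("42.foo") yields FLOAT, DOT and TEXT 
tokens, while tokenize("42.5") yields a single FLOAT token, which 
is the behaviour I'd expect from the grammar as written.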

No syntactic predicates should be required here; requiring them 
doesn't seem like a good solution to this problem.  (If ANTLR 
internally needs the predicates for some reason, it should 
generate them itself, since they would just restate what the 
grammar already said.)  In fact I can't think of any case where 
syntactic predicates should need to be specified explicitly in 
the lexer.

(Note that the above output was tested under b7; it's possible 
it's been corrected since without me noticing.)


