[antlr-interest] Lexing problem I cannot resolve

Thomas Brandon tbrandonau at gmail.com
Wed Aug 6 19:27:49 PDT 2008


On Thu, Aug 7, 2008 at 6:09 AM, Gavin Lambert <antlr at mirality.co.nz> wrote:
> At 07:07 7/08/2008, Carter Cheng wrote:
>>I tried this variant too but it does not seem to parse
>>correctly(see attached). It still thinks that the 1. is a FLOAT
>>token. Perhaps I have run afoul of some bug in 3.0.1?
>
> I did say that it would do that, since that's what you seemed to be wanting.
>
> "1" should be an INT, "1." should be a FLOAT, ".2" should be a FLOAT, "1.2"
> should be a FLOAT, and "1..2" should be an INT RANGE INT.
>
> If you want to disallow "1." as a FLOAT, then you need to change the DIGIT*
> to a DIGIT+ as I originally suggested; though you might also need to add
> additional lookahead.
>
>
I think he means that, regardless of what follows, 1. forces ANTLR onto the
FLOAT path. So 1..0 matches 1. as a FLOAT and then errors.
The portion ( (DOTDOT) => | ( '.' DIGIT* { $type = FLOAT; } )? )
produces code like this (under 3.1b2, and similar under 3.0.1):
if ( (LA4_0=='.') ) {
    alt4=2;
}
else if ( (synpred1_test()) ) {
    alt4=1;
}
I gather this is due to ANTLR trying to insert predicates only when there is
syntactic ambiguity: here the '.' lookahead already picks alternative 2, so
the synpred is never consulted. Changing it to ( ('.' ~'.')=> '.' DIGIT*
{$type=FLOAT;} )? fixes this. Or use ( ('.' DIGIT)=> '.' DIGIT+
{$type=FLOAT;} )? if you don't want to allow 1. as a FLOAT.
IIRC there were some changes to synpred handling post b2, so this might
be fixed in the latest snapshot.
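
To make the two-character lookahead concrete, here is a rough hand-written
Java sketch of the decision the two predicates are meant to encode. It is not
ANTLR output, and the class, method and flag names are made up for this
example. One assumption to note: the sketch treats end of input as "not a
dot", whereas a real ~'.' needs an actual character there, so behaviour for a
trailing 1. at end of input may differ under ANTLR.

```java
import java.util.ArrayList;
import java.util.List;

// Hand-written illustration only; all names here are hypothetical.
final class NumberLexerSketch {

    static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    // Mimics the syntactic predicate, evaluated with dotPos at the '.'.
    // allowTrailingDot == true  ~ ('.' ~'.')=>   (1. is a FLOAT)
    // allowTrailingDot == false ~ ('.' DIGIT)=>  (1. is not a FLOAT)
    static boolean fractionFollows(String s, int dotPos, boolean allowTrailingDot) {
        boolean hasNext = dotPos + 1 < s.length();
        if (allowTrailingDot) {
            // Assumption: end of input counts as "not a dot".
            return !hasNext || s.charAt(dotPos + 1) != '.';
        }
        return hasNext && isDigit(s.charAt(dotPos + 1));
    }

    // Returns the token type names (INT, FLOAT, RANGE) for the input.
    static List<String> tokenize(String s, boolean allowTrailingDot) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (isDigit(c)) {
                String type = "INT";
                while (i < s.length() && isDigit(s.charAt(i))) i++;
                // Only commit to the FLOAT path after looking past the '.'.
                if (i < s.length() && s.charAt(i) == '.'
                        && fractionFollows(s, i, allowTrailingDot)) {
                    i++; // consume '.'
                    while (i < s.length() && isDigit(s.charAt(i))) i++;
                    type = "FLOAT";
                }
                out.add(type);
            } else if (c == '.' && i + 1 < s.length() && s.charAt(i + 1) == '.') {
                i += 2; // ".." is the range operator
                out.add("RANGE");
            } else if (c == '.' && i + 1 < s.length() && isDigit(s.charAt(i + 1))) {
                i++; // leading-dot float such as .2
                while (i < s.length() && isDigit(s.charAt(i))) i++;
                out.add("FLOAT");
            } else {
                throw new IllegalArgumentException("unexpected '" + c + "' at " + i);
            }
        }
        return out;
    }
}
```

With either flag value, NumberLexerSketch.tokenize("1..2", flag) gives
INT RANGE INT, because the '.' after the 1 is only consumed once the character
after it has been checked.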

Tom.

