[antlr-interest] Problem with lexical nondeterminism - ANTLR 2.7.7

Jim Idle jimi at temporal-wave.com
Thu Jan 3 13:40:03 PST 2008



> -----Original Message-----
> From: Gavin Lambert [mailto:antlr at mirality.co.nz]
> Sent: Thursday, January 03, 2008 1:27 PM
> To: clive.i.hill at jpmorgan.com
> Cc: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] Problem with lexical nondeterminism - ANTLR 2.7.7
> 
> At 10:02 4/01/2008, clive.i.hill at jpmorgan.com wrote:
> >If I try your suggestions with APAC_NUMERIC_TICKER above NUMBER I
> >still get the same issue.  Harold your suggestion would work if
> >it matches NUMBER first but it was actually an
> >APAC_NUMERIC_TICKER.  The match actually happens the other way
> >around.
> 
> Ok, what's probably happening here is that ANTLR is discarding the
> predicate because there's only one alt.  I thought it was only
> ANTLR 3 that did that, but I guess v2 did as well.  In that case
> you'll need to combine the rules:
> 
> NUMBER
>      : (INT COMMA) => APAC_NUMERIC_TICKER {$setType(APAC_NUMERIC_TICKER);}
>      | (INT COLON) => RANGE               {$setType(RANGE);}
>      | (DOT) => FLOAT                     {$setType(FLOAT);}
>      | (INT DOT) => FLOAT                 {$setType(FLOAT);}
>      | INT                                {$setType(INT);}
>      ;
> 
> protected
> APAC_NUMERIC_TICKER
>      : INT COMMA CHAR CHAR
>      ;
> 

I suggest that you combine the common elements, though, which will minimize 
both the predicates (probably to none) and the calls to protected rules:

NUMBER
    : INT
      (
          COMMA (whatelse?)   { $setType(APAC_NUMERIC_TICKER); }
        | DOT INT             { $setType(FLOAT); }
        | COLON INT           { $setType(RANGE); }
        |                     { $setType(INT); }
      )
    | DOT
      (
          INT                 { $setType(FLOAT); }
        |                     { $setType(DOT); }
      )
    ;

Basically, factor out the common roots (here INT and DOT), then branch on 
whatever follows. You can often avoid predicates this way.
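
If it helps to see the idea outside ANTLR, here is a rough hand-written 
sketch of the same left-factoring in Python. It is not what ANTLR generates; 
the function name scan_number is made up for illustration, CHAR is assumed 
to mean a single letter (the thread does not define it), the ticker tail 
follows the quoted APAC_NUMERIC_TICKER rule (INT COMMA CHAR CHAR), and 
falling back to INT when a suffix does not complete is my own choice.

```python
# Hypothetical hand-written scanner illustrating left-factoring.
# Not ANTLR output; names and fallback behavior are assumptions.

def scan_number(text, pos=0):
    """Return (token_type, lexeme) for the token starting at text[pos]."""

    def digits(i):
        # Advance past a run of ASCII digits and return the new index.
        while i < len(text) and text[i] in "0123456789":
            i += 1
        return i

    start = pos
    end = digits(pos)

    if end > start:
        # Common root: INT has been consumed once; branch on what follows.
        nxt = text[end] if end < len(text) else ""
        if nxt == ",":
            # Assumed tail: two letters (CHAR CHAR in the quoted rule).
            tail = text[end + 1:end + 3]
            if len(tail) == 2 and tail.isalpha():
                return "APAC_NUMERIC_TICKER", text[start:end + 3]
        elif nxt in (".", ":"):
            after = digits(end + 1)
            if after > end + 1:
                kind = "FLOAT" if nxt == "." else "RANGE"
                return kind, text[start:after]
        # No suffix completed: the token is a plain INT (assumed fallback).
        return "INT", text[start:end]

    if pos < len(text) and text[pos] == ".":
        # Second common root: DOT, optionally followed by INT.
        after = digits(pos + 1)
        if after > pos + 1:
            return "FLOAT", text[start:after]
        return "DOT", "."

    raise ValueError("no NUMBER-like token at position %d" % pos)
```

The point is that INT is matched exactly once and the token type is decided 
afterwards from the next character, so nothing has to be scanned ahead 
speculatively the way a syntactic predicate does.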

Jim


