[antlr-interest] V3 lexer behaviour clarifications

Terence Parr parrt at cs.usfca.edu
Sat Mar 31 15:42:26 PDT 2007


On Mar 31, 2007, at 3:09 PM, Gavin Lambert wrote:

> Just trying to get my head around some of the differences between  
> lexer and parser (in V3).  Am I correct in assuming that the lexer  
> doesn't get any of the cool new LL(*) lookahead and backtracking  
> that's available to the parser?

ANTLR does exactly the same thing for lexer, parser, and tree parser.

> Because logically, if I've got two lexer rules like so:
>
> FLOAT : INT '.' INT;
> INT : ('0'..'9')+;
>
> There's obviously ambiguity between them, but I would expect it to  
> try matching as a FLOAT first (since I listed it first) and only if  
> that fails should it return an INT and then try lexing whatever  
> comes after it as a separate token.

Should be no ambiguity.  The '.' resolves things.

> Trying a similar grammar to the above (not the exact grammar above,  
> though), however, that's not what seems to be happening.  It just  
> reports an error and then treats it as an INT.  The only way I can  
> get it to do the behaviour I want is to make a composite rule with  
> predicates and explicit token-type changing code, which seems ugly.
>
> Is this normal for now?  If so, will it be improved in the future?   
> Or am I just doing something stupid?

ANTLR correctly generates the DFA predictor for me:

[Attachment scrubbed from archive: T_dec-2.png, image/png, 14731 bytes]
http://www.antlr.org/pipermail/antlr-interest/attachments/20070331/384aea4a/attachment-0001.png

That says to predict alt 1, FLOAT, if it sees a '.' after the INT;  
otherwise predict alt 2 (INT).
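
For readers who cannot see the scrubbed image, here is a rough hand-written
sketch of the decision that DFA describes. It is not the code ANTLR
generates; the class name PredictSketch, the method predict, and the
constants are made up for illustration, and returning 0 for "no viable
alternative" is an assumption of the sketch.

```java
// Illustrative only: a hand-written stand-in for the predictor described
// above, not ANTLR-generated code. All names here are hypothetical.
final class PredictSketch {
    static final int ALT_FLOAT = 1; // FLOAT : INT '.' INT ;
    static final int ALT_INT = 2;   // INT : ('0'..'9')+ ;
    static final int NO_ALT = 0;    // assumption: neither rule can start here

    // Scan past the leading digits, then decide on the next character only.
    static int predict(String input, int start) {
        int i = start;
        while (i < input.length() && isDigit(input.charAt(i))) {
            i++;
        }
        if (i == start) {
            return NO_ALT; // no digits at all, so neither FLOAT nor INT applies
        }
        if (i < input.length() && input.charAt(i) == '.') {
            return ALT_FLOAT; // a '.' after the digits predicts FLOAT
        }
        return ALT_INT; // anything else (or end of input) predicts INT
    }

    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }
}
```

In this sketch predict("12.5", 0) gives 1 and predict("12", 0) gives 2.
Note that predict("12.", 0) also gives 1, because the decision is made on
the '.' alone; matching the FLOAT rule itself would then fail on the
missing digits.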

I ran:

/tmp/antlr3 $ java org.antlr.Tool -dfa T.g
ANTLR Parser Generator  Version 3.0b7 (??, 2007)  1989-2007
/tmp/antlr3 $ open -a graphviz T_dec-2.dot

on:

lexer grammar T;
FLOAT : INT '.' INT;
INT : ('0'..'9')+;

:)

Ain't that slick?  Must be something else going on with your grammar.

Ter


