[antlr-interest] Lexer fails

Douglas Godfrey douglasgodfrey at gmail.com
Fri Jan 27 08:19:46 PST 2012


Copy the Number rule from the SQL2003 grammar on the ANTLR downloads page.

The Number rule handles both fixed and float in a single rule.
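
For anyone without that grammar to hand, the idea is roughly the following. This is only a minimal sketch of the single-rule approach, not the actual rule from the SQL2003 grammar: the names NUMBER and DIGIT are placeholders of mine, and the exponent part (optional '-' only) simply mirrors the FLOAT rule quoted below.

```antlr
// Hypothetical sketch, NOT copied from the SQL2003 grammar.
fragment DIGIT : '0'..'9' ;

// One token type covers 123, 123., 123.45, 1e5 and 1.5E-3.
// The fraction and the exponent are both optional parts of the same rule.
NUMBER
    : DIGIT+ ( '.' DIGIT* )? ( ('E' | 'e') '-'? DIGIT+ )?
    ;
```

With this shape each optional part is decided on a single character ('.' or 'E'/'e'), so the lexer should not have to choose between two rules that share a prefix. If the parser needs to tell the kinds apart, it can inspect the token text.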

On 1/27/12 2:25 AM, "Gavin Lambert" <antlr at mirality.co.nz> wrote:

>At 14:27 27/01/2012, Peter Piper wrote:
> >I'm sorry that I can only talk about the old stuff (v3) but can
> >anyone see how the following lexer token definition:
> >
> >FLOAT : ('0'..'9')+ ( '.' ('0'..'9')* )? ('E' | 'e') ('-')?
> >('0'..'9')+ ;
>[...]
> >
> >There is no 'e' or 'E' in the input, so why does the ANTLR lexer
> >think that this is a "better" token to output than the other one
> >I want it to go for, namely:
> >
> >FIXED : ('0'..'9')+ '.' ('0'..'9')* ;
>
>v3 lexers mostly just use single-char lookahead when around
>looping constructs, which isn't sufficient to disambiguate these
>cases.  You need to help it out a bit by providing explicit
>lookahead hints.  (Reportedly v4 is infinitely better at this, but
>I haven't tried it myself yet.)
>
>fragment FLOAT : ('0'..'9')+ ( '.' ('0'..'9')* )? ('E' | 'e')
>('-')? ('0'..'9')+;
>
>FIXED : (FLOAT) => FLOAT { $type = FLOAT; }
>       | ('0'..'9')+ '.' ('0'..'9')*
>       ;
>
>Or left-factor it for more efficiency (and throw an INTEGER in,
>since I assume you have one of those too):
>
>fragment FLOAT : ;
>fragment FIXED : ;
>
>INTEGER : ('0'..'9')+
>         ( ('.' ('0'..'9')) => '.' ('0'..'9')* { $type = FIXED; }
>         ( ('E'|'e') '-'? ('0'..'9')+ { $type = FLOAT; } )? )?
>         ;
>
>Or just call all of these things NUMBERs and sort it out in the
>parser. :)
