[antlr-interest] Lexer fails
Douglas Godfrey
douglasgodfrey at gmail.com
Fri Jan 27 08:19:46 PST 2012
Copy the Number rule from the SQL2003 grammar on the ANTLR downloads page.
That rule handles both fixed and float in a single rule.
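
As a rough illustration of the idea, a single rule covering the integer, fixed and float forms might look like the sketch below. This is not the SQL2003 rule itself. It assumes ANTLR v3 syntax, and the names NumberSketch, NUMBER and DIGIT are hypothetical.

```
lexer grammar NumberSketch;   // hypothetical grammar name

// Helper only; fragment rules never become tokens on their own.
fragment DIGIT : '0'..'9' ;

// One token type for 12, 12., 12.5, 12e3 and 12.5E-3.
// The exponent mirrors the FLOAT rule quoted below (optional '-' only).
NUMBER
    : DIGIT+ ( '.' DIGIT* )? ( ('E' | 'e') ('-')? DIGIT+ )?
    ;
```

With only one rule, the lexer never has to choose between FIXED and FLOAT; the parser (or an action that sets $type) can tell the forms apart afterwards if needed. Input such as 12e with no digits after the exponent marker would probably still be reported as a lexer error by this sketch.
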
On 1/27/12 2:25 AM, "Gavin Lambert" <antlr at mirality.co.nz> wrote:
>At 14:27 27/01/2012, Peter Piper wrote:
> >I'm sorry that I can only talk about the old stuff (v3) but can
> >anyone see how the following lexer token definition:
> >
> >FLOAT : ('0'..'9')+ ( '.' ('0'..'9')* )? ('E' | 'e') ('-')?
> >('0'..'9')+ ;
>[...]
> >
> >There is no 'e' or 'E' in the input, so why does the ANTLR lexer
> >think that this is a "better" token to output than the other one
> >I want it to go for, namely:
> >
> >FIXED : ('0'..'9')+ '.' ('0'..'9')* ;
>
>v3 lexers mostly use just single-char lookahead around looping
>constructs, which isn't sufficient to disambiguate these cases.
>You need to help it out a bit by providing explicit lookahead
>hints (syntactic predicates). (Reportedly v4 is infinitely better
>at this, but I haven't tried it myself yet.)
>
>fragment FLOAT : ('0'..'9')+ ( '.' ('0'..'9')* )? ('E' | 'e')
>('-')? ('0'..'9')+;
>
>FIXED : (FLOAT) => FLOAT { $type = FLOAT; }
> | ('0'..'9')+ '.' ('0'..'9')*
> ;
>
>Or left-factor it for more efficiency (and throw an INTEGER in,
>since I assume you have one of those too):
>
>fragment FLOAT : ;
>fragment FIXED : ;
>
>INTEGER : ('0'..'9')+
> ( ('.' ('0'..'9')) => '.' ('0'..'9')* { $type = FIXED; }
> ( ('E'|'e') '-'? ('0'..'9')+ { $type = FLOAT; } )? )?
> ;
>
>Or just call all of these things NUMBERs and sort it out in the
>parser. :)