[antlr-interest] Lexer generator bug?

Wed Nov 24 09:07:59 PST 2010

On 11/24/2010 09:55 AM, Arthur Goldberg wrote:
> Hello
> 
> ANTLRWorks can automatically generate
> FLOAT
>      :   ('0'..'9')+ '.' ('0'..'9')* EXPONENT?
>      |   '.' ('0'..'9')+ EXPONENT?
>      |   ('0'..'9')+ EXPONENT
>      ;
> 
> fragment
> EXPONENT : ('e'|'E') ('+'|'-')? ('0'..'9')+ ;
> 
> But I want a FLOAT that doesn't require an exponent:
> FLOAT_NOE2
>      :   '.' ('0'..'9')+
>      | ('0'..'9')+ ('.' ('0'..'9')* )?
>      ;

Is this in addition to the other FLOAT rule, or does it replace it?

> It seems that this should recognize any of these: 1.2, .3, 4.
> But this doesn't recognize 4. I can't find a branch in the generated
> Lexer that doesn't enter '.' ( '0' .. '9' )*

Do you have an INTEGER rule elsewhere in the grammar that also matches
('0'..'9')+?
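
If so, the two rules overlap on input like 4. Here is a minimal sketch
of that situation. The INTEGER rule is my assumption (I haven't seen
your full grammar); FLOAT_NOE2 is copied from your message:

```
// Hypothetical: an INTEGER rule that appears earlier in the grammar.
INTEGER
    :   ('0'..'9')+
    ;

// Your rule. Its second alternative also matches a bare "4"
// when the optional fractional part is absent.
FLOAT_NOE2
    :   '.' ('0'..'9')+
    |   ('0'..'9')+ ( '.' ('0'..'9')* )?
    ;
```

As far as I know, when two lexer rules can match the same input,
ANTLR 3 resolves the conflict in favor of the rule listed first. With
the ordering above, 4 would come out as INTEGER and never as
FLOAT_NOE2.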

> Also
> 
> FLOAT
>      :   ('0'..'9')+ '.' ('0'..'9')*
>      |   '.' ('0'..'9')+
>      |   ('0'..'9')+
>      ;
> 
> doesn't recognize 4, but I've not examined that lexer.
> 
> Do I misunderstand lexer syntax or is this a Lexer generator bug?

Have you looked at the following example?

> http://www.antlr.org/wiki/display/ANTLR3/Lexer+grammar+for+floating+point%2C

It is a good example of combining several lexer rules to eliminate
ambiguities during lexing and, more importantly, to make lexing more
efficient and produce the right tokens.

You may not need all of it, but it may help you understand why your
attempts aren't doing what you expect.
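
To give a rough idea of what "combining rules" can look like, here is
a small sketch. It is not taken from the wiki page, it assumes the
Java target, and the token names INTEGER and FLOAT are only
illustrative:

```
// Declares the FLOAT token type without giving it a rule of its own.
tokens { FLOAT; }

// One rule matches every numeric form and picks the token type
// after it has seen whether a '.' is present.
INTEGER
    :   ('0'..'9')+
        ( '.' ('0'..'9')* { $type = FLOAT; } )?   // "4." or "1.2" becomes FLOAT
    |   '.' ('0'..'9')+ { $type = FLOAT; }        // ".3" becomes FLOAT
    ;
```

Because only one rule starts with a digit, there is no overlap left
for the lexer to resolve: 4 stays INTEGER, while 1.2 and .3 are
re-typed as FLOAT.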

> Regards
> Arthur

-- 
Kevin J. Cummings
kjchome at rcn.com
cummings at kjchome.homeip.net
cummings at kjc386.framingham.ma.us
Registered Linux User #1232 (http://counter.li.org)