[antlr-interest] lexer: matching float vs int

Thomas Brandon tbrandonau at gmail.com
Tue Sep 9 10:27:47 PDT 2008


Ah, you are using ANTLR 2.x. Unless you have a particularly compelling
reason to stay, you would be best off moving to ANTLR 3.1. The two
versions are quite different, so if you stick with 2.x you should make
that clear in any emails and be careful to rely only on information
that applies to 2.x.
It's been a while, but I imagine your problem is insufficient lookahead:
both INT and FLOAT can begin with a run of digits, so the lexer cannot
tell them apart from the first characters alone. I believe you should
have got a message to this effect when running ANTLR. Try combining the
INT and FLOAT rules as in John Brodie's reply (though adapting it to
ANTLR 2.x syntax, of course).
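
A rough, untested sketch of what a combined rule could look like in 2.x
syntax, assuming the usual 2.x pattern of protected helper rules plus a
syntactic predicate. DIGIT and NUMBER are names made up for illustration;
they are not from the posted grammar or from John Brodie's reply:

```antlr
// DIGIT and NUMBER are hypothetical names, not part of the posted grammar.
// Protected rules are only invoked from other lexer rules; they are never
// matched directly as tokens, but INT and FLOAT still get token types.
protected DIGIT : '0'..'9' ;
protected INT   : ( DIGIT )+ ;
protected FLOAT : ( DIGIT )* '.' ( DIGIT )+ ;

// Single entry point for numbers. The syntactic predicate scans ahead for
// "optional digits, then '.'" before committing to FLOAT; otherwise the
// input is matched as INT.
NUMBER
    :   ( ( DIGIT )* '.' ) => FLOAT { $setType(FLOAT); }
    |   INT                         { $setType(INT); }
    ;
```

With something like this, the parser should be able to keep referring to
INT and FLOAT; only the lexer's entry rule for numbers changes.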

Tom.

On Wed, Sep 10, 2008 at 3:02 AM, Olya Krachina <okrachin at purdue.edu> wrote:
> Quoting Thomas Brandon <tbrandonau at gmail.com>:
>
>> On Tue, Sep 9, 2008 at 2:01 PM, Olya Krachina <okrachin at purdue.edu> wrote:
>> > Hello,
>> > I am new to ANTLR and I seem to be stuck on this.
>> > I need to have 2 datatypes defined, int and float. Currently I have
>> > them defined like this in my .g file:
>> >
>> > INT:      ('0'..'9')+;
>> > FLOAT:    ('0'..'9')*('.')('0'..'9')+ ;
>> >
>> > So, this does not work. When it comes across an int, I think it tries
>> > to match the longest string, i.e. float, but finds a space instead of
>> > '.' (since it's an int) and bails out.
>> >
>> > ps: I know this is more of a regexp question, but if someone could help
>> > out, I would greatly appreciate it.
>> >
>> > thanks
>> Those two rules work fine for me. I think either other rules are
>> interfering or you are providing invalid input to the grammar. Try
>> making a minimal grammar that reproduces the problem and post that
>> along with the exact input that fails.
>>
>> Tom.
>>
>
> this is the input file:
>
> BEGIN
> PROGRAM
> END
> PROTO
> 34
> 89967
> END
>
> these are the rules so far:
>
> ASSIGN                  : ":=" ;
> COMMA                   : ',' ;
> PLUS                    : '+' ;
> MINUS                   : '-' ;
> STAR                    : '*' ;
> SLASH                   : '/' ;
> EQUAL                   : '=' ;
> LESS                    : '<' ;
> GRT                     : '>' ;
> LPAREN                  : '(' ;
> RPAREN                  : ')' ;
> SEMICOL                 : ';' ;
>
>
> Whitespace
>        :       ( ( '\003'..'\010' | '\t' | '\013' | '\f' | '\016'..'\037'
>                  | '\177'..'\377' | ' ' )
>                | "\r\n"                { newline(); }
>                | ( '\n' | '\r' )       { newline(); }
>                )                       { _ttype = Token.SKIP;  }
>        ;
>
> IDENTIFIER
>        :        ( 'a'..'z' | 'A'..'Z' )
>                                ( 'a'..'z' | 'A'..'Z' | '0'..'9' )*
>                ;
>
> INT:             ('0'..'9')+ ;
> FLOAT:          (INT)? ('.') (INT);
>
> If only ONE (either INT or FLOAT) is defined in the .g file, there are no
> errors, but with both of them defined I run into the problem.
> thanks again.
>
>

