[antlr-interest] Lexing problem I cannot resolve

Raphael Reitzig r_reitzi at cs.uni-kl.de
Sun Aug 3 04:21:07 PDT 2008


Hi again!

You are probably right; you may also want to consider Gavin's response.

But do I understand correctly that in your language '..5' is a valid  
range? What range is that? I only had 'INT..INT' in mind and would  
not create a single token of it.

Consider the following:

// imaginary tokens used only as tree roots
tokens { ELLIPSIS; RANGE; FLOAT; INTEGER; }

numerical_construct :
   a=INT  THREE_DOTS     -> ^(ELLIPSIS $a)
 | a=INT  TWO_DOTS b=INT -> ^(RANGE $a $b)
 | a=INT? ONE_DOT b=INT  -> ^(FLOAT $a? $b)
 | a=INT                 -> ^(INTEGER $a);

INT : ('0'..'9')+;
THREE_DOTS : '...';
TWO_DOTS : '..';
ONE_DOT : '.';

I think that may work; more experienced list members will have a say,  
I suppose. In particular, I was not sure about the float rewrite rule.  
As far as I know a rewrite cannot add $a and $b, so the rule above just  
puts the two integers in as children; you can deal with the conversion  
to float in your target language.
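
To make the idea concrete, here is a minimal sketch in Python (not ANTLR,
and not what ANTLR generates) of what the grammar above is meant to do.
The names tokenize and numerical_construct are made up for this
illustration, and it assumes the whole input is a single construct. The
point is that once the lexer has no FLOAT rule of its own, '1..2' splits
into INT TWO_DOTS INT, and the float is only assembled afterwards from
the two integer texts.

```python
import re

# One alternative per token type. The dot tokens are listed longest first,
# so '...' is tried before '..', and '..' before '.'.
TOKEN_RE = re.compile(
    r"(?P<INT>[0-9]+)"
    r"|(?P<THREE_DOTS>\.\.\.)"
    r"|(?P<TWO_DOTS>\.\.)"
    r"|(?P<ONE_DOT>\.)"
)


def tokenize(text):
    """Split text into (type, text) pairs; raise on an unexpected character."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        # lastgroup is the name of the alternative that matched.
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


def numerical_construct(text):
    """Mirror the four parser alternatives on a complete input string."""
    tokens = tokenize(text)
    kinds = [kind for kind, _ in tokens]
    values = [value for _, value in tokens]
    if kinds == ["INT", "THREE_DOTS"]:
        return ("ELLIPSIS", int(values[0]))
    if kinds == ["INT", "TWO_DOTS", "INT"]:
        return ("RANGE", int(values[0]), int(values[2]))
    if kinds == ["INT", "ONE_DOT", "INT"]:
        # Join the texts instead of adding numbers, so '1.05' keeps its zero.
        return ("FLOAT", float(f"{values[0]}.{values[2]}"))
    if kinds == ["ONE_DOT", "INT"]:
        return ("FLOAT", float(f"0.{values[1]}"))
    if kinds == ["INT"]:
        return ("INTEGER", int(values[0]))
    raise ValueError(f"not a numerical construct: {text!r}")
```

Joining the token texts rather than adding two numbers matters for
inputs like '1.05', where the fraction part has a leading zero that an
integer conversion would drop.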

Regards

Raphael

"Carter Cheng" <carter_cheng at yahoo.com> wrote (Sun Aug  3 13:01:37 2008):

> Thanks for the reply. I think that will only disambiguate between  
> the .2 and .. cases and not the example I am describing in this case.
>
> The problem is that the entry point into the FSA would be the leading  
> digit, and therefore the range rule will not be considered at all.  
> The only thing I can think of, though I am not sure how to state it  
> in ANTLR, is to use syntactic predicates and do something like this:
>
> digit+ '...'=> (return an int) /* int followed by ellipsis */
> digit+ '..' => (return an int) /* int followed by range */
> digit+ '.' => (possible float value) /* float or error */
>
> Or is this wrong?
>
> Regards,
>
> Carter.
>
>
> --- On Sun, 8/3/08, Raphael Reitzig <r_reitzi at cs.uni-kl.de> wrote:
>
>> From: Raphael Reitzig <r_reitzi at cs.uni-kl.de>
>> Subject: Re: [antlr-interest] Lexing problem I cannot resolve
>> To: antlr-interest at antlr.org
>> Date: Sunday, August 3, 2008, 3:41 AM
>> Hi Carter!
>>
>> Moving the range rule above the float rule should do the job.
>> ANTLR chooses the first matching rule it discovers, testing
>> top down.
>>
>> Regards
>>
>> Raphael
>>
>> "Carter Cheng" <carter_cheng at yahoo.com>
>> wrote (Sun Aug  3 12:16:38 2008):
>>
>> > Hi,
>> >
>> > Yet another beginner's question. I have been having difficulties
>> > with a lexing ambiguity and I am curious how one would go about
>> > resolving it with ANTLR. The problem I am having is as follows. I
>> > have a grammar with standard C-like INT and FLOAT lexing rules, but
>> > I also have the ellipsis ... and range .. tokens in the grammar.
>> > The difficulty I am having is with this input string:
>> >
>> > 1..2
>> >
>> > The lexer seems to lex this as two FLOATs as opposed to INT RANGE
>> > INT. In the language in question FLOAT FLOAT is illegal, but
>> > obviously the lexer cannot know that. Is there a way to resolve
>> > this cleanly in ANTLR?
>> >
>> > Thanks in advance,
>> >
>> > Carter.
>> >
>> >
>> >
>> >
>>
>>
>>
>> ----------------------------------------------------------------
>> This message was sent using IMP, the Internet Messaging
>> Program.
>
>
>
>



----------------------------------------------------------------
This message was sent using IMP, the Internet Messaging Program.