[antlr-interest] Solving lexer ambiguities

Jesse McGrew jmcgrew at gmail.com
Wed Sep 12 11:48:20 PDT 2012


Yes, try using a gating semantic predicate to look ahead and only
allow the DIGIT+ option if it sees one or more digits followed by
something that isn't a letter:

DOT ( {yourLookAhead()}?=> DIGIT+ | /* lookahead failed */ {$type = DOT;} )

(It'd be nice if syntactic predicates worked for this, but I don't
think they do.)
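
To make the lookahead concrete, here is a minimal sketch of the check that
yourLookAhead() might perform, written as standalone Java so it can be tried
outside a generated lexer. The class and method names are hypothetical, and
it assumes DIGIT is '0'..'9' and LETTER is an ASCII letter, since those
fragments were not shown in the thread. Inside an ANTLR 3 Java lexer the same
scan would read characters with input.LA(i) rather than charAt.

```java
// Hypothetical helper: illustrates the lookahead logic only, not ANTLR API.
final class DotLookaheadSketch {

    // Assumption: DIGIT is '0'..'9' (the fragment was not shown in the thread).
    static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    // Assumption: LETTER is an ASCII letter (the fragment was not shown either).
    static boolean isLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    /**
     * Returns true if, starting at index start (the character right after
     * the dot), there are one or more digits and the character following
     * the digit run is not a letter (or the input ends).
     */
    static boolean digitsThenNonLetter(CharSequence text, int start) {
        int i = start;
        // Skip over the run of digits.
        while (i < text.length() && isDigit(text.charAt(i))) {
            i++;
        }
        if (i == start) {
            return false; // no digits after the dot at all
        }
        // Accept at end of input or when the next character is not a letter.
        return i == text.length() || !isLetter(text.charAt(i));
    }
}
```

With this check, ".35" passes (digits, then end of input), while
".2structure" fails because the digit run is followed by a letter, so the
rule falls through to the DOT alternative. Note that an exponent such as
".35E5" would also fail, because 'E' is a letter; the check would have to be
extended if the dot-first alternative should keep supporting EXPONENT.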

Jesse

On Wed, Sep 12, 2012 at 11:39 AM, Jose Juan Tapia <jjtapia at gmail.com> wrote:
> A lookahead I guess.
>
>
> On Wed, Sep 12, 2012 at 2:20 PM, Jose Juan Tapia <jjtapia at gmail.com> wrote:
>
>> Thank you for your suggestion. Unfortunately it still seems to be
>> recognizing the .2 as a float. I was wondering if there was any way to tell
>> the LEXER definition that any structure of the kind
>>
>> DOT DIGIT+
>>
>> should be recognized as a float, but if it has the form
>>
>> DOT DIGIT+ LETTER+, that is a DOT STRING where my STRING definition is
>>
>> STRING: (LETTER | DIGIT | '_')+
>>
>>
>> it is recognized as a DOT STRING combination instead of a FLOAT.
>> Maybe I could be more strict with my STRING definition in some way?
>>
>>
>> On Tue, Sep 11, 2012 at 10:41 PM, John B. Brodie <jbb at acm.org> wrote:
>>
>>> Greetings!
>>>
>>> You might try something like the following --- obviously untested since
>>> you did not provide a complete example of your issue:
>>>
>>> FLOAT:
>>>    (DIGIT)+ '.' (DIGIT)* EXPONENT?
>>> | (DIGIT)+ EXPONENT;
>>>
>>>   DOT: '.' ( (DIGIT)+ EXPONENT? {$type=FLOAT;} )? ;
>>>
>>> Hopefully, in your language, strings such as 2structure can never match
>>> a FLOAT (e.g. something like 1structure.2E5.35 isn't permitted).
>>>
>>> Hope this helps...
>>>     -jbb
>>>
>>> On 09/11/2012 08:45 PM, Jose Juan Tapia wrote:
>>> > So I was having a problem with my lexer recognition where my double
>>> > token is defined as follows.
>>> >
>>> > FLOAT:
>>> >    (DIGIT)+ '.' (DIGIT)* EXPONENT?
>>> > | '.' (DIGIT)+ EXPONENT?
>>> > | (DIGIT)+ EXPONENT;
>>> >
>>> >
>>> > In addition, I have certain structures where the following
>>> > syntax:
>>> >
>>> > 1structure.2structure .35
>>> >
>>> > Should be recognized by the following grammar
>>> >
>>> > STRING (DOT STRING)? FLOAT
>>> >
>>> > The problem, of course, is that my lexer recognizes the .2 token as a
>>> > FLOAT, and I'm not sure how I can make it choose the alternative
>>> > solution. (I've tried using backtracking to no avail. Maybe I'm doing it
>>> > wrong, but my current assumption is that since the ambiguity is at the
>>> > lexer level rather than the parser level, the parser can't do much to
>>> > resolve the conflict.)
>>>
>>>
>>> List: http://www.antlr.org/mailman/listinfo/antlr-interest
>>> Unsubscribe:
>>> http://www.antlr.org/mailman/options/antlr-interest/your-email-address
>>>
>>
>>
>>
>> --
>> José Juan Tapia Valenzuela
>> Research Associate
>> University of Pittsburgh
>> 3076.1 Biological Sciences Tower 3
>> Pittsburgh, Pa 15260
>>
>

