[antlr-interest] Solving lexer ambiguities

Jim Idle jimi at temporal-wave.com
Wed Sep 12 13:38:38 PDT 2012


Start here:

http://www.antlr.org/wiki/display/ANTLR3/Lexer+grammar+for+floating+point%2C+dot%2C+range%2C+time+specs

It should enable you to work out a solution.
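
To make the idea concrete: a leading-dot FLOAT is only accepted when the
character right after it cannot continue a STRING. Here is a minimal sketch
of that lookahead in plain Python rather than ANTLR. The regular expressions
are assumed stand-ins for the DIGIT, LETTER, EXPONENT and STRING rules quoted
below (their real definitions were not posted), and tokenize is a
hypothetical helper, not part of ANTLR.

```python
import re

# Assumed stand-ins for the EXPONENT fragment and the characters STRING accepts.
EXPONENT = r"[eE][+-]?[0-9]+"
STRING_CHAR = r"[A-Za-z0-9_]"

# Every FLOAT alternative ends with a negative lookahead: the match is
# rejected if the next character could continue a STRING.
TOKEN_RE = re.compile(
    rf"""
    (?P<FLOAT>
        [0-9]+\.[0-9]*(?:{EXPONENT})?(?!{STRING_CHAR})
      | [0-9]+{EXPONENT}(?!{STRING_CHAR})
      | \.[0-9]+(?:{EXPONENT})?(?!{STRING_CHAR})
    )
  | (?P<STRING>{STRING_CHAR}+)
  | (?P<DOT>\.)
  | (?P<WS>\s+)
    """,
    re.VERBOSE,
)


def tokenize(text):
    """Return (token_type, lexeme) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        if m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


for token in tokenize("1structure.2structure .35"):
    print(token)
```

On the sample input from the original question this produces STRING, DOT,
STRING, FLOAT. In an ANTLR 3 lexer the same check would have to be written
as a predicate on the rule that starts with '.', which is not shown or
tested here.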

Jim




> -----Original Message-----
> From: antlr-interest-bounces at antlr.org
> [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Jose Juan Tapia
> Sent: Wednesday, September 12, 2012 11:21 AM
> To: John B. Brodie
> Cc: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] Solving lexer ambiguities
>
> Thank you for your suggestion. Unfortunately it still seems to be
> recognizing the .2 as a float. I was wondering if there was any way to
> tell the LEXER definition that any structure of the kind
>
> DOT DIGIT+
>
> should be recognized as a float, but if it has the form
>
> DOT DIGIT+ LETTER+, that is, a DOT STRING, where my STRING definition is
>
> STRING: (LETTER | DIGIT | '_')+
>
>
> it should be recognized as a DOT STRING combination instead of a
> FLOAT.
> Maybe I could be more strict with my STRING definition in some way?
>
>
> On Tue, Sep 11, 2012 at 10:41 PM, John B. Brodie <jbb at acm.org> wrote:
>
> > Greetings!
> >
> > You might try something like the following (obviously untested,
> > since you did not provide a complete example of your issue):
> >
> > FLOAT:
> >     (DIGIT)+ '.' (DIGIT)* EXPONENT?
> >   | (DIGIT)+ EXPONENT;
> >
> > DOT: '.' ( (DIGIT)+ EXPONENT? {$type=FLOAT;} )? ;
> >
> > Hopefully, in your language, strings like 2structure can never match
> > a FLOAT (e.g. something like 1structure.2E5.35 isn't permitted).
> >
> > Hope this helps...
> >     -jbb
> >
> > On 09/11/2012 08:45 PM, Jose Juan Tapia wrote:
> > > So I was having a problem with my lexer, where my FLOAT token is
> > > defined as follows.
> > >
> > > FLOAT:
> > >     (DIGIT)+ '.' (DIGIT)* EXPONENT?
> > >   | '.' (DIGIT)+ EXPONENT?
> > >   | (DIGIT)+ EXPONENT;
> > >
> > >
> > > However, in addition to that, I have certain structures where the
> > > following syntax:
> > >
> > > 1structure.2structure .35
> > >
> > > should be recognized by the following grammar:
> > >
> > > STRING (DOT STRING)? FLOAT
> > >
> > > The problem, of course, is that my lexer is recognizing the .2 token
> > > as a FLOAT, and I'm not sure how I can make it choose the alternative
> > > solution. (I've tried using backtracking to no avail. Maybe I'm doing
> > > it wrong, but my current assumption is that since the ambiguity is at
> > > the lexer rather than at the parser level, the parser can't do much
> > > to resolve the conflict.)
> >
> >
> > List: http://www.antlr.org/mailman/listinfo/antlr-interest
> > Unsubscribe:
> > http://www.antlr.org/mailman/options/antlr-interest/your-email-address
> >
>
>
>
> --
> José Juan Tapia Valenzuela
> Research Associate
> University of Pittsburgh
> 3076.1 Biological Sciences Tower 3
> Pittsburgh, Pa 15260

