[antlr-interest] Solving lexer ambiguities

Jose Juan Tapia jjtapia at gmail.com
Wed Sep 12 18:06:33 PDT 2012


Thank you. I think I got it. For completeness' sake, I'll post my solution:

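@members{
  // Lookahead helper, evaluated right after the '.' has been matched.
  // Returns false when the digits after the dot run into an identifier
  // character (e.g. ".2structure"), so the dot stays a plain DOT.
  public boolean floatLA(){
    int counter = 1;
    // Skip over the run of digits that follows the dot.
    while (true) {
      int c = input.LT(counter);
      if (c >= '0' && c <= '9') {
        counter++;
      } else {
        break;
      }
    }
    // First character after the digits: a letter or '_' means STRING, not FLOAT.
    int next = input.LT(counter);
    if ((next >= 'A' && next <= 'Z') || next == '_' || (next >= 'a' && next <= 'z'))
      return false;
    return true;
  }
}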


DOT : '.' ( {floatLA()}? => ((DIGIT)+ EXPONENT? {$type=FLOAT;}) | {$type=DOT;} );
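
To see the decision in isolation, here is a rough hand-written sketch in plain Java. It is not ANTLR-generated code: the class name DotOrFloat and the classify helper are made up for illustration, the input is a plain String instead of the lexer's CharStream, and the EXPONENT part is left out for brevity.

```java
/** Hand-written illustration of the DOT-versus-FLOAT decision; not ANTLR output. */
final class DotOrFloat {

    static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    static boolean isLetter(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
    }

    /**
     * Mirrors floatLA(): pos is the index just after the '.'.
     * Skips a run of digits, then returns false if a letter or '_'
     * follows, and true otherwise (including at end of input).
     */
    static boolean floatLA(String text, int pos) {
        int i = pos;
        while (i < text.length() && isDigit(text.charAt(i))) {
            i++;
        }
        if (i == text.length()) {
            return true; // nothing follows the digits
        }
        char c = text.charAt(i);
        return !(isLetter(c) || c == '_');
    }

    /** Classifies the token that starts at the '.' found at index dot. */
    static String classify(String text, int dot) {
        int i = dot + 1;
        boolean hasDigit = i < text.length() && isDigit(text.charAt(i));
        if (hasDigit && floatLA(text, i)) {
            while (i < text.length() && isDigit(text.charAt(i))) {
                i++;
            }
            return "FLOAT(" + text.substring(dot, i) + ")";
        }
        return "DOT"; // the digits, if any, are left for the next token
    }

    public static void main(String[] args) {
        String input = "1structure.2structure .35";
        System.out.println(classify(input, input.indexOf('.')));     // DOT
        System.out.println(classify(input, input.lastIndexOf('.'))); // FLOAT(.35)
    }
}
```

One thing the sketch makes visible: because 'E' is a letter, floatLA() as written returns false for an input like .2E5, so the dot comes out as a DOT there as well. That is fine for my language, but worth checking if you need a leading-dot float with an exponent.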


On Wed, Sep 12, 2012 at 4:38 PM, Jim Idle <jimi at temporal-wave.com> wrote:

> Start here:
>
>
> http://www.antlr.org/wiki/display/ANTLR3/Lexer+grammar+for+floating+point%2C+dot%2C+range%2C+time+specs
>
> It should enable you to work out a solution.
>
> Jim
>
>
>
>
> > -----Original Message-----
> > From: antlr-interest-bounces at antlr.org
> > [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Jose Juan Tapia
> > Sent: Wednesday, September 12, 2012 11:21 AM
> > To: John B. Brodie
> > Cc: antlr-interest at antlr.org
> > Subject: Re: [antlr-interest] Solving lexer ambiguities
> >
> > Thank you for your suggestion. Unfortunately it still seems to be
> > recognizing the .2 as a float. I was wondering if there was any way to
> > tell the LEXER definition that any structure of the kind
> >
> > DOT DIGIT+
> >
> > should be recognized as a float, but if it has the form
> >
> > DOT DIGIT+ LETTER+, that is a DOT STRING where my STRING definition is
> >
> > STRING: (LETTER | DIGIT | '_')+
> >
> >
> > it is recognized instead as a DOT STRING combination, not as a FLOAT.
> > Maybe I could be more strict with my STRING definition in some way?
> >
> >
> > On Tue, Sep 11, 2012 at 10:41 PM, John B. Brodie <jbb at acm.org> wrote:
> >
> > > Greetings!
> > >
> > > You might try something like the following --- obviously untested
> > > since you did not provide a complete example of your issue:
> > >
> > > FLOAT:
> > >    (DIGIT)+ '.' (DIGIT)* EXPONENT?
> > > | (DIGIT)+ EXPONENT;
> > >
> > >   DOT: '.' ( (DIGIT)+ EXPONENT? {$type=FLOAT;} )? ;
> > >
> > > Hopefully, in your language, strings like 2structure can never match
> > > a FLOAT (e.g. something like 1structure.2E5.35 isn't permitted).
> > >
> > > Hope this helps...
> > >     -jbb
> > >
> > > On 09/11/2012 08:45 PM, Jose Juan Tapia wrote:
> > > > I am having a problem with my lexer recognition, where my double
> > > > token is defined as follows:
> > > >
> > > > FLOAT:
> > > >    (DIGIT)+ '.' (DIGIT)* EXPONENT?
> > > > | '.' (DIGIT)+ EXPONENT?
> > > > | (DIGIT)+ EXPONENT;
> > > >
> > > >
> > > > However, in addition to that, I have certain structures where the
> > > > following syntax:
> > > >
> > > > 1structure.2structure .35
> > > >
> > > > Should be recognized by the following grammar
> > > >
> > > > STRING (DOT STRING)? FLOAT
> > > >
> > > > The problem, of course, is that my lexer is recognizing the .2
> > > > token as a FLOAT, and I'm not sure how I can make it choose the
> > > > alternative solution. (I've tried using backtracking to no avail.
> > > > Maybe I'm doing it wrong, but my current assumption is that since
> > > > the ambiguity is at the lexer rather than at the parser level, the
> > > > parser can't do much to solve the conflict.)
> > >
> > >
> > > List: http://www.antlr.org/mailman/listinfo/antlr-interest
> > > Unsubscribe:
> > > http://www.antlr.org/mailman/options/antlr-interest/your-email-address
> > >
> >
> >
> >
> >
>
>



-- 
José Juan Tapia Valenzuela
Research Associate
University of Pittsburgh
3076.1 Biological Sciences Tower 3
Pittsburgh, Pa 15260

