[antlr-interest] Ambiguous lexing task

Daniels, Troy (US SSA) troy.daniels at baesystems.com
Fri Apr 2 14:56:01 PDT 2010


 

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org 
> [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Cliff Hudson
> Sent: Friday, April 02, 2010 4:59 PM
> To: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] Ambiguous lexing task
> 
> I've played around with it a bit, and I modified NAMECHAR to be:
> 
> fragment NAMECHAR
>     : LETTER
>     | DIGIT
>     | '_'
>     | {input.LA(2) != '>'}?=> '-'
>     ;
> 
> This seems to do the trick.  However, I'm concerned this is 
> not a best practice for this kind of situation.  Could I get 
> a suggestion as to the "correct" way to go about this?
> 

Is it ever possible that that text should be interpreted as

my-identifier-  >  foo

(That is, my-identifier- "greater than" foo?) If it is, then the language is ambiguous to the lexer and you will have a lot of complications to deal with.  If this is not a valid interpretation, then that is a reasonable way to handle it.
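
To make the two readings concrete, here is a rough sketch of the same lookahead rule written as a small hand-rolled tokenizer in Python rather than ANTLR. It is only an illustration under assumptions: the GT token and the whitespace skipping are hypothetical additions that are not in the grammar quoted below, and the regular expression stands in for the predicate on '-'.

```python
import re

# Token names mirror the quoted grammar; GT and WS are hypothetical
# additions used only to show the "greater than" reading.
# A '-' counts as a name character only when the next character is not '>',
# which is the same check as the {input.LA(2) != '>'}? predicate.
TOKEN_RE = re.compile(r"""
    (?P<OP_TRANSFORM>->)
  | (?P<ID>[A-Za-z_](?:[A-Za-z0-9_]|-(?!>))*)
  | (?P<GT>>)
  | (?P<WS>\s+)
""", re.VERBOSE)


def tokenize(text):
    """Return a list of (token_name, lexeme) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


if __name__ == "__main__":
    print(tokenize("my-identifier->foo"))
    print(tokenize("my-identifier- > foo"))
```

With this rule, the unspaced text always splits into an ID, the '->' operator, and another ID, while a trailing '-' stays in the ID only when something other than '>' follows it. If the first reading also had to be possible without the space, no rule of this kind could tell the two apart.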

Troy


> On Fri, Apr 2, 2010 at 1:48 PM, Cliff Hudson
> <cliff.s.hudson at gmail.com> wrote:
> 
> > I have a string which I need to parse for IDs and operators.  This is
> > normally pretty easy, but there is one case where a character in the
> > ID can also match one character in the operator.  The tokens are:
> >
> > OP_TRANSFORM : '->' ;
> >
> > ID : (LETTER | '_') (options { greedy=true; } : NAMECHAR)* ;
> >
> > fragment NAMECHAR : LETTER | DIGIT | '_' | '-' ;
> >
> > LETTER : 'a'..'z' | 'A'..'Z' ;
> > DIGIT : '0'..'9' ;
> >
> >
> > The issue is in parsing the following string:
> >
> > my-identifier->foo
> >
> > The ID token of course matches 'my-identifier-', and then I am left
> > with an extraneous '>'.  Is there a way to construct a set of lexing
> > rules, possibly with actions, that would correctly separate out the ->
> > from the ID?  In this case, I want the '-' in OP_TRANSFORM to be the
> > preferred path and to match '->' even in the above case.
> >
> > Thanks.
> >