[antlr-interest] Syntax ambiguity?

Kevin J. Cummings cummings at kjchome.homeip.net
Fri Mar 4 00:27:13 PST 2011


On 03/04/2011 03:02 AM, Olivier Lefevre wrote:
> Sorry, the subject is not very informational but I cannot
> get the hang of the problem, so I cannot devise a better
> subject. I have this small grammar:
> 
>    grammar Gr3;
> 
>    options { output=AST; }
> 
>    stat : fun1 | fun2 ;
>    fun1 : 'fun1(' ID1 ')' ;
>    fun2 : 'fun2(' ID2 ')' ;
> 
>    fragment DIGIT  : '0'..'9' ;
>    fragment LETTER : ('a'..'z' | 'A'..'Z') ;
> 
>    ID1 : (DIGIT | LETTER)+ ;
>    ID2 : (DIGIT | LETTER | '_' | '-' | '.')+ ;
>    WS  : (' '|'\t')+ { skip(); } ;
> 
> It can recognize, say, fun1(AB) and fun2(AB_CD) as expected
> but not fun2(AB), which should also be valid since AB matches
> both ID1 or ID2. Rewriting fun2 as
> 
>    fun2 : 'fun2(' (ID1 | ID2) ')' ;
> 
> works but is not satisfactory because I want an ID2 as fun2
> argument, not an ID1. So, how can I force ANTLR to "consider"
> ID1 in this position?

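The lexer tokenizes without knowing what the parser expects, so AB
always comes out as ID1, never as ID2. The usual answer would be to
merge the two rules and let the lexer pick the type from the characters
it actually sees:

fragment ID2 : ;
ID1 : ( DIGIT | LETTER | ( '_' | '-' | '.' ) { $type = ID2; } )+
    ;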

The catch is that AB is still lexed as ID1. If only ID2 tokens are legal
in fun2, the parser will still see an ID1 there, so fun2 has to accept
both types and you would have to change the ID1 to ID2 "on the fly",
even though the token was lexed as ID1.
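
To make the idea concrete, here is a rough Python model of that scheme.
It is not ANTLR code and not what ANTLR generates; the names lex_id and
parse_stat are made up for the illustration. It only mimics the merged
lexer rule (a token starts as ID1 and becomes ID2 once it contains '_',
'-' or '.') and a fun2 that accepts either type and retypes it:

```python
import re

ID1, ID2 = "ID1", "ID2"
EXTRA = set("_-.")  # characters that only ID2 allows


def lex_id(text):
    """Model of the merged lexer rule: ID1 unless an extra character appears."""
    if not re.fullmatch(r"[0-9A-Za-z_.\-]+", text):
        raise ValueError(f"not an identifier: {text!r}")
    token_type = ID1
    for ch in text:
        if ch in EXTRA:
            token_type = ID2  # plays the role of { $type = ID2; }
    return (token_type, text)


def parse_stat(source):
    """Model of stat: fun1 takes an ID1, fun2 takes either type and retypes it."""
    m = re.fullmatch(r"[ \t]*(fun1|fun2)\([ \t]*([^ \t()]+)[ \t]*\)[ \t]*", source)
    if m is None:
        raise ValueError(f"syntax error: {source!r}")
    name, arg = m.groups()
    token_type, text = lex_id(arg)
    if name == "fun1" and token_type != ID1:
        raise ValueError("fun1 expects an ID1")
    if name == "fun2":
        token_type = ID2  # the "on the fly" type change
    return (name, token_type, text)
```

With this model, parse_stat("fun2(AB)") is accepted and its argument
ends up typed as ID2, while parse_stat("fun1(AB_CD)") is rejected.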

> Thanks,
> 
> -- O.L.

I hope this helps....

-- 
Kevin J. Cummings
kjchome at verizon.net
cummings at kjchome.homeip.net
cummings at kjc386.framingham.ma.us
Registered Linux User #1232 (http://counter.li.org)

