[antlr-interest] Disambiguating simple grammar – could anyone help?

Gavin Lambert antlr at mirality.co.nz
Fri Apr 10 04:29:11 PDT 2009


At 22:43 10/04/2009, Tomasz Jastrzebski wrote:
>I cannot figure out how to disambiguate the following grammar 
>using syntactic predicate, so the range rule takes precedence 
>over the offset rule.
[...]
>expression
>  :
>    Identifier ((range) => range)?
>  | offset
>  ;
>range : Integer ('-' Integer)? ;
>offset : ('+' | '-') Integer ;

The problem is that since one expression can follow another 
without any separators, ANTLR has no way to tell whether the 
input "foo 12-30" should be a single expression (an identifier 
followed by the range 12-30) or two expressions (an identifier 
with the single-integer range 12, followed by the offset -30).

ANTLR will normally default to the longest match (i.e. the former), 
so what you already have should work OK, but it will still warn 
about the ambiguity.  Unless you can remove the ambiguity from 
your input language, or be more specific about how to tell the two 
cases apart (e.g. by checking whitespace), there's probably not a 
whole lot you can do about it.
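
If the warning itself is the main nuisance, one thing that might 
be worth trying is moving the predicate to where the ambiguous 
decision actually is, i.e. the optional ('-' Integer) inside range, 
rather than in front of range as a whole.  This is only an 
untested sketch, assuming ANTLR 3 syntax and the token names from 
your grammar:

```antlr
// Sketch only: the predicate guards just the optional tail, which is
// the decision where a '-' could also start a following offset.
range
  : Integer ( ('-' Integer) => '-' Integer )?
  ;
```

This doesn't remove the ambiguity from the language; it just 
states the greedy choice explicitly, so "foo 12-30" is still read 
as one identifier followed by the range 12-30.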

>Integer : ('0'..'0')+ ;

I'm going to assume that this was a typo in the email and that 
'0'..'9' was intended; as written, the rule only matches runs of 
zeros.

>WhiteSpace : (' ' | '\t' | '\r\n' | '\r')+ { $channel=HIDDEN; };

You should probably remove the \r from the third term (leaving 
'\n') -- otherwise this rule won't match a bare \n, which is the 
line terminator in UNIX files and fairly common.
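
Putting both lexer comments together, the rules might look 
something like this (a sketch only; the '0'..'9' range is my guess 
at what was intended):

```antlr
// Assumed fix: digits 0-9 rather than only '0'.
Integer : ('0'..'9')+ ;

// '\n' on its own covers UNIX files; a "\r\n" pair is still
// matched as two iterations of the loop.
WhiteSpace : (' ' | '\t' | '\n' | '\r')+ { $channel=HIDDEN; };
```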


