[antlr-interest] Ambiguous grammar or Antlr?

Jim Idle jimi at temporal-wave.com
Thu Jul 2 08:45:41 PDT 2009


Gustaf Johansson wrote:
> I have a complex grammar (900 lines) that parses incorrectly sometimes.
> The grammar is basically just an Antlr transformation of the ETSI
> defined TTCN3 BNF.
>
> Here is a snippet from my grammar (modified for simplicity):
>
>   assignment: ref '=' exp ';'? ;
>   ref: ('a' .. 'z' | '_') ('.' ('a' .. 'z' | '_'))* ;
>   exp: addExp ('+' addExp)* ;
>   ... "a lot of math exps of different precedences"
>   unaryExp: ('-' | '+')? primary ;
>   primary: opCall | value | '(' exp ')' ;
>   opCall: "this is a quite complex rule"
>   value: INT | ref ;
>
> Now if i parse:
>   v_some_var = v_some_other_var;
>
> I get:
>   line x:y no viable alternative at input ';'
>
> I suppose its because it expects '(' following a opCall-name or
> something, though im not sure.
>
> If i change the definition of primary to:
>   primary: value | opCall | '(' exp ')' ;
>
> It parses correctly.
> I have backtrack and memoize set to true.
>
> I just dont understand why Antlr wont even try the second option in
> the 'primary' rule, before reporting an error, this seems wrong to me.
> Could someone please shed some light on this for me?
>   
Basically, your backtrack and memoize options are getting in the way of 
you seeing the issues in your grammar. A good test would be to turn those 
options off and see what ANTLR tells you about conflicts. 
Sometimes the order of the alts is important, and you might need to 
shift them around in primary.
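
As a rough sketch of that test (the grammar name is made up, and I am
assuming the options were set at the grammar level, since the original
post does not show the options block), it amounts to something like:

```
grammar Ttcn3Sketch;   // hypothetical name, not from the original post

// Before: these options suppress the conflict reports.
// options { backtrack = true; memoize = true; }

// While diagnosing: leave both options out, regenerate the parser, and
// read the analysis warnings ANTLR prints for decisions such as the one
// in primary below.

primary : opCall | value | '(' exp ')' ;
```

Any warnings reported for primary point at the alts that share a common
prefix, and those are the ones to reorder or factor.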

However, unless this is a quick and dirty parser to do something once in 
a while, copying the normative specification for a language directly 
into ANTLR with backtrack=true is rarely going to work well, I am afraid. 
Even with backtracking, you still need to know something about LL(k) 
parsing, and switching on backtrack mode will hide all your problems. It 
also means, as you have found out, that in the event of a syntax error 
you cannot determine much about the location of the error.

Basically, you will need to left factor at least some parts of your 
grammar. In this case, if you have something like:

| ID
| ID '(' expr ')'
| ID something else

(and your actual case is probably more complicated than that), then look to do:

| ID (  '(' expr ')'
      | something else
     )?

and so on.
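
Applied to the primary rule from the original post, the factoring might
look like the sketch below. It assumes that opCall starts with the same
ref prefix as value and then continues with a parenthesised argument
list. The real opCall rule was not shown, so argList here is only a
hypothetical placeholder for whatever it actually accepts:

```
primary
    : ref ( '(' argList? ')' )?   // ref alone is a value; ref followed by '(' is an opCall
    | INT                         // the other half of the original value rule
    | '(' exp ')'
    ;

argList                           // hypothetical stand-in for the opCall arguments
    : exp ( ',' exp )*
    ;
```

With the common ref prefix matched once, the decision between a plain
value and a call is made on the single token that follows it, so
backtracking is not needed for this rule.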

In your position, I would turn off backtracking and start again 
from scratch, starting with expression and gradually building up. This 
won't be as bad as it sounds because you can do a lot of copying and 
pasting, I would bet.

Jim

