[antlr-interest] rules for re-factoring grammars?

Gavin Lambert antlr at mirality.co.nz
Fri Mar 21 13:27:26 PDT 2008


At 08:51 22/03/2008, Benjamin Shropshire wrote:
 >Does anyone known of a good wright up on the rules that can be 
used
 >to re-factor productions? I'm thinking in particular of rules 
that
 >can remove left recursion including with ambiguous grammars.
 >
 >A :
 >    F |
 >    A B |
 >    D |
 >    A C |
 >    E;

Well, the general replacement for the above that ANTLRworks can do 
automatically is:

A :
     F | D | E
     ( B | C )*
   ;

It's easy enough to see how it comes by this, especially if you 
look at the syntax diagram.  Each of the F, D, and E alts are 
terminals, since they don't reference another copy of 
A.  Therefore they can only occur once.

In the two remaining alts (A B and A C), the A is always on the 
left, so the terminals in the final rule must also always be on 
the left.  Now, since A is self-recursive, this means that "A B" 
could expand to "F B", or to "F B B", or even to "F C C B C B", 
and so on.  So clearly the non-A portions of these alts need to be 
added with a star, since they can occur any number of times and in 
any order.

As to whether there's a formal writeup on this sort of thing, I 
really have no idea :)



More information about the antlr-interest mailing list