[antlr-interest] rules for re-factoring grammars?

Benjamin Shropshire shro8822 at vandals.uidaho.edu
Fri Mar 21 17:12:56 PDT 2008


Gavin Lambert wrote:
> At 08:51 22/03/2008, Benjamin Shropshire wrote:
> >Does anyone known of a good wright up on the rules that can be used
> >to re-factor productions? I'm thinking in particular of rules that
> >can remove left recursion including with ambiguous grammars.
> >
> >A :
> >    F |
> >    A B |
> >    D |
> >    A C |
> >    E;
>
> Well, the general replacement for the above that ANTLRworks can do 
> automatically is:
>
> A :
>     F | D | E
>     ( B | C )*
>   ;
>
> It's easy enough to see how it comes by this, especially if you look 
> at the syntax diagram.  Each of the F, D, and E alts are terminals, 
> since they don't reference another copy of A.  Therefore they can only 
> occur once.
>
> In the two remaining alts (A B and A C), the A is always on the left, 
> so the terminals in the final rule must also always be on the left.  
> Now, since A is self-recursive, this means that "A B" could expand to 
> "F B", or to "F B B", or even to "F C C B C B", and so on.  So clearly 
> the non-A portions of these alts need to be added with a star, since 
> they can occur any number of times and in any order.

Actually I'd already figure that one out along with a scheme for 
refactoring action rules to match. The issue I'm worried about is; will 
the second grammar ever get a different parser tree if, for instance, 
the grammar is ambiguities?

After some thinking I think I can answer my own question; yes it will

add in this:
------
H : A I* G;

I : B | C;
------

then parse an H out of "FBCG" (assuming the needed actions to rebuild 
the left recursive tree)

old:    FBCG -> h(a(F), i(B), i(C),G);
new:   FBCG -> h(a(a(a(F),B),C),G);

Am I missing something here?

(note: I'm not working specifically with antlr here.)


More information about the antlr-interest mailing list