[antlr-interest] rules for re-factoring grammars?
Gavin Lambert
antlr at mirality.co.nz
Fri Mar 21 13:27:26 PDT 2008
At 08:51 22/03/2008, Benjamin Shropshire wrote:
>Does anyone known of a good wright up on the rules that can be
used
>to re-factor productions? I'm thinking in particular of rules
that
>can remove left recursion including with ambiguous grammars.
>
>A :
> F |
> A B |
> D |
> A C |
> E;
Well, the general replacement for the above that ANTLRworks can do
automatically is:
A :
F | D | E
( B | C )*
;
It's easy enough to see how it comes by this, especially if you
look at the syntax diagram. Each of the F, D, and E alts are
terminals, since they don't reference another copy of
A. Therefore they can only occur once.
In the two remaining alts (A B and A C), the A is always on the
left, so the terminals in the final rule must also always be on
the left. Now, since A is self-recursive, this means that "A B"
could expand to "F B", or to "F B B", or even to "F C C B C B",
and so on. So clearly the non-A portions of these alts need to be
added with a star, since they can occur any number of times and in
any order.
As to whether there's a formal writeup on this sort of thing, I
really have no idea :)
More information about the antlr-interest
mailing list