[antlr-interest] Left-factoring and recursion

Nigel Sheridan-Smith nbsherid at secsme.org.au
Mon Jan 10 21:30:52 PST 2005

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Loring Craymer
> Sent: Tuesday, 11 January 2005 3:15 PM
> To: antlr-interest at antlr.org
> Subject: RE: [antlr-interest] Left-factoring and recursion
> Maybe I was too terse in my response.  My point is more "Don't left
> factor,
> use syntactic predicates first".  In your situation, left factoring is a
> form of premature optimization.  I don't suggest waiting for ANTLR 3;
> however, the fact that it will automatically left factor and remove
> unneeded syntactic predicates should give you pause before you take on the
> task of manually refactoring your grammar.  [Of course, the tradeoffs
> change if performance is a key requirement; even then, the preferred
> strategy is to get a working grammar and then left factor manually if you
> need the performance and cannot wait for ANTLR 3 to solve the performance
> problem.  In that case, you tend to start by refactoring top-level rules
> and working your way down.  In practice, though, left factoring tends to
> buy little improvement over synpreds.]

Hi Loring,

Thanks for the information... As I am working more and more on this I am
discovering that a significant part of the problem is that the grammar of
ASN.1 is highly ambiguous throughout. Curly braces are used for just about
everything, and it's the main source of the heart-ache.

The other problem is that the ASN.1 standard productions are highly
convoluted - to work out *what* the syntactic predicates should be is what
is taking all the time. For each non-determinism, you have to track down
*all* of the exit branches that are being predicted and then work out what
is the best alternative. The recursion that is involved just makes things an
order of magnitude worse. Also, if the grammar didn't excessively use
productions where they weren't necessary then it might be a lot easier to
track down the exit branches.

After spending quite a few hours getting the grammar into SableCC, I managed
to get it to compile but then I started to hit shift-reduce and
reduce-reduce conflicts (I expected this - but didn't know what sort of
conflicts to expect since I've never used LR(1) :( ). After a few hours of
reading on LR parsers, and then removing a few reduce-reduce conflicts by
factoring, I hit my first shift-reduce conflict. Again, it's the same sort
of problem as with LL(k) - find all the possible exit branches and then work
out how to give preferential treatment to one alternative to another. 

Unfortunately, SableCC does not have a way to give shifts preference over
reductions - this is definitely the strength of syntactic predicates.
Yacc/bison (on the other hand) appear to have some sort of "default"
processing for these sort of conflicts, where shifts always take
precendence, etc. The only other option with SableCC is to create
disambiguating rules, which are even more complex in themselves.

So I think using SableCC is definitely less of an option - I think I will
probably forget about doing my own grammar now since I probably have enough
ASN.1 syntax under the belt to work out what is going on with the existing
ANTLR ASN.1 grammar. It's probably going to be good enough for what I need,
and probably less work involved - i.e. the most "bang for my buck". Even if
its not deterministic (or unpredictable) then at least I can work those
things out down the track when they become more relevant.

Still, having some sort of tool to help with solving complex
non-determinisms, left recursion, etc would be handy, even if it sorted
through the grammar and dumped all the relevant production rules. However,
if ANTLR 3.x will solve all these problems, then I guess such a tool will
quickly go out of fashion! In any case, it's only for grammars that are
already well established (i.e. pre-existing standards) - it's a shame the
people involved in designing these languages can't make them deterministic
from the beginning! 

Cheers, and thanks for all the clarification and assistance... 


Nigel Sheridan-Smith
PhD research student

Faculty of Engineering
University of Technology, Sydney
Phone: 02 9514 7946
Fax: 02 9514 2435 

More information about the antlr-interest mailing list