[antlr-interest] Left-factoring and recursion

Loring Craymer Loring.G.Craymer at jpl.nasa.gov
Mon Jan 10 20:15:24 PST 2005

At 02:38 PM 1/10/2005, Nigel Sheridan-Smith wrote:

> > -----Original Message-----
> > From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> > bounces at antlr.org] On Behalf Of Loring Craymer
> > Sent: Monday, 10 January 2005 4:01 PM
> > To: antlr-interest at antlr.org
> > Subject: RE: [antlr-interest] Left-factoring and recursion
> >
> > > -----Original Message-----
> > > From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> > > bounces at antlr.org] On Behalf Of Nigel Sheridan-Smith
> > > Sent: Sunday, January 09, 2005 2:38 PM
> > > To: antlr-interest at antlr.org
> > > Subject: [antlr-interest] Left-factoring and recursion
> > >
> > >
> >
> > Nigel--
> >
> > Use syntactic predicates as a first step for removing non-determinisms.
> > Some of the ANTLR 2 reported non-determinisms are not real, and result
> > from
> > the approximate LLk algorithm.  Once you have a working grammar, left-
> > factor
> > if you need the added performance.  I note that ANTLR 3 does/will do
> > automatic left-factoring and will remove predicates where they are
> > unneeded;
> > does it make sense to spend a lot of effort manually left-factoring when
> > the
> > tool will do it for you within a year?
>Thanks Loring,
>Like most projects, I needed this yesterday or about a year ago even, so
>waiting for ANTLR 3 is not really an option ;)

Maybe I was too terse in my response.  My point is more "Don't left factor, 
use syntactic predicates first".  In your situation, left factoring is a 
form of premature optimization.  I don't suggest waiting for ANTLR 3; 
however, the fact that it will automatically left factor and remove 
unneeded syntactic predicates should give you pause before you take on the 
task of manually refactoring your grammar.  [Of course, the tradeoffs 
change if performance is a key requirement; even then, the preferred 
strategy is to get a working grammar and then left factor manually if you 
need the performance and cannot wait for ANTLR 3 to solve the performance 
problem.  In that case, you tend to start by refactoring top-level rules 
and working your way down.  In practice, though, left factoring tends to 
buy little improvement over synpreds.]

Synpreds should enable you to remove non-determinisms (even the ones 
arising from LLk-approx) from your grammar within a day, even for a large 
grammar.  Only after you have a working grammar should you worry about 
performance and whether a large scale effort to refactor your grammar makes 

So I would suggest that you continue with your own grammar (mostly because 
it is most familiar to you and sounds like it is almost working), but 
insert synpreds and defer left-factoring until and unless performance 
becomes a critical issue.  Even then, profile first.


>Some of the options are:
>1) Continue on with the existing ANTLR ASN.1 grammar - I've tried running it
>with some simple MIBs but it strikes some trouble when it gets to some of
>the SMI-specific things and the processing of OIDs. However, if I strip out
>all the actions and extend it from there maybe that will work. The grammar
>seems a little incomplete, but I'm still not 100% familiar with ASN.1
>syntax, so it's hard to tell if anything is missing.
>2) Continue left-factoring, simplifying and removing recursion from my own
>LL(k) grammar. I kept doing this yesterday with some success, but there are
>multiple levels of recursiveness throughout the grammar and it could
>potentially take weeks to sort through it all and then add actions, etc.
>3) Try a LALR or other LR parser - I have put almost all of the grammar into
>SableCC but I am being to realise how much power ANTLR has in comparison! I
>will find out pretty soon how much work it's going to be to get the whole
>grammar up and running! In any case, I'm going to end up with several
>hundred generated java classes unless I can factorise the grammar down to
>less rules. However, it appears that SableCC only allows one copy of each
>terminal in any production!! Very annoying! I'd try yacc, etc but I really
>just want something that presents the SMI information through a Java API.
>4) Link to the 'libsmi' library through the Java Native Interfaces - this
>library isn't perfect, and I would lose some cross-platform compatibility,
>but it's looking like a relatively easy option at this stage.
>5) Buy a licence to Java 'jasmi' library (http://www.mibdesigner.com/jasmi/)
>or another equivalent tool. However, I'm a relatively poor student, so would
>be reluctant to fork out the money on my own. Also, what happens if this
>project then goes commercial later on down the track? If the product I
>choose requires single-user licenses or royalties then it will be expensive
>to on-sell another product that uses that library.
>So I'm leaning towards (3), (1), (2)/(4) or (5) in approximately that order.
>If anyone has any suggestions or advice, I'm all ears! :D

More information about the antlr-interest mailing list