[antlr-interest] Re: antlr vs. sableCC comparison

Sat May 24 21:44:24 PDT 2003

--- In antlr-interest at yahoogroups.com, "Oliver Zeigermann" 
<oliver at z...> wrote:
> --- In antlr-interest at yahoogroups.com, "lgcraymer" <lgc at m...> 
wrote:
> > My reaction to SableCC has always been, "Why?", and the answer 
> > always seems to be "it was a good excuse for Gagnon to have fun 
> > writing a master's thesis". It has less functionality than 
either 
> > ANTLR or JavaCC and was introduced after both were available. 
> 
> At the time he wrote SableCC, only few *LALR* generators for Java 
> were around and it was yet undecided which would be the best, so 
he 
> just tried to do something good. As you said, both ANTLR or JavaCC 
> are *LL*.

I don't see the issue as being LL versus LR here--more one of 
reinventing the wheel, and then not inventing a better wheel.  For 
that matter, there are much more egregious examples than SableCC, 
which was at least an attempt to be the first kid on the Java block 
and not just a "me too" yacc clone.

> > Predicated LL(k) (ANTLR) parsing can handle any context-free 
> > grammar, but LALR does not and modern versions of yacc support 
GLR 
> > parsing to get past the LALR limitation.
> 
> I think you mix something up here (or do I?). Surely neither YACC 
> nor SableCC nor ANTLR can handle all context free grammars! What 
you 
> wanted to say is any context free language, i.e. a language that 
can 
> be described by a context free grammar, can also be described by 
an 
> ANTLR grammar, right?

Yes, that's a bit more precise.  And there is also a constraint on 
the grammars being finite.

> 
> Anyway, please tell more about GLR. How does it work?

I'd suggest looking at a few papers on citeseer 
(http://citeseer.nec.jp)--just do a search on GLR parsing, or Earley 
parsers, or Tomita parsers.  Any short description (generate all 
possible parses to build a parse forest and then match a tree in the 
forest) tends to be incoherent.  Or check into one of the recent 
versions of yacc which incorporate an Earley parser.

> Personally, I am fan of ANTLR and LL, but still, there are some 
good 
> reasons to choose bottom up methods. Some purists like the idea of 
> describing a language completely by the means of a most human 
> readable and thus natural (hopefully context free) grammar. To use 
> such grammars with LL or LR methods you will have to modify them. 
It 
> turns out that LR methods need less modification of the grammar 
than 
> LL, which is considered valuable by some pepole.

But a little more care on placing actions to avoid surprises--that's 
a practical tradeoff.  Also, the "less modification" is open to 
question--the issues tend to be left factoring and placement of 
syntactic predicates, both of which could be automated during 
grammar analysis instead of requiring programmer intervention.  This 
is a performance versus ergonomics issue, and I think that Ter made 
the right decision in this case--the rewrites are usually pretty 
minor.  For ANTLR 3, we have talked about IDE support for automatic 
placement of syntactic predicates.  (The big advantage of this would 
be to determine minimum-length semantic predicates automatically.)

For me, the two things that make LL preferable to LR for development 
are 1.) freeform action placement, and 2.) generation of humanly 
understandable recursive-descent code which can be debugged.  These 
aren't strictly LL versus LR items:  LR parsers could be implemented 
to avoid the action placement surprises, and they could be 
implemented to generate human-understandable code (probably at a 
cost in parser performance).  For that matter, there have been table-
driven LL(1) parser generators.

> 
> > ANTLR's tree grammar approach is more powerful than the visitor 
> > approach.  In fact, a visitor can be expressed as a special 
ANTLR 
> > tree (I've not tested this code, but it should work):
> > 
> > visitor
> >     :
> >     ( visit )+
> >     ;
> > 
> > visit
> >     :
> >     (      .
> >     |      #( .  ( visit )+ )
> >     )
> >     ;
> 
> I do not think that will work (maybe it does). How to decide 
between 
> first and second alternative of rule visit?
>
> This will do, I think (visitor rule unchanged):
> 
> visit
>     :
>     #( .  ( visit )* )
>     ;

That should work.

> Oliver

Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/