[antlr-interest] Re: ANTLR vs lex/yacc, debugging and ANTLR 3

Ian Kaplan iank at bearcave.com
Sun Jan 16 15:50:17 PST 2005


> I have seen many people comment on how they dislike yacc/lex vs
> ANTLR.

  I can't speak for the multitudes of ANTLR users in their preference
  for ANTLR over YACC, but I did write a web page "Why Use ANTLR"
  which can be found here:

   http://www.bearcave.com/software/antlr/antlr_expr.html

  I find ANTLR more sophisticated in the way it handles synthesized
  results (results of grammar productions) and arguments passed down
  the grammar hierarchy.  In a similar vein, I like the way local
  initialization code blocks can be defined.  ANTLR also generates
  readable code, in contrast to YACC's tables, so you can check that
  it is generating what you intended.

  The issue of whether the recursion is on the left or the right does
  not have any traction for me.  I find it easy enough to write the
  grammar either way.  And I'm not a grammar theorist either.
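
  As a rough illustration of the points above, here is a hand-written
  Java sketch of the shape a recursive-descent parser takes.  It is
  NOT actual ANTLR output, and every name in it (ExprSketch, expr,
  term, atom, env) is made up for this example.  The return value of
  each method stands in for a synthesized result, the env parameter
  for an argument passed down the grammar hierarchy, and the local
  variable at the top of each method for a local initialization
  block.  The rules are written with a loop rather than left
  recursion, assuming no whitespace in the input.

```java
import java.util.Map;

// Hand-written illustration only; this is not code generated by ANTLR.
// All class, method, and parameter names are hypothetical.
final class ExprSketch {
    private final String src;
    private int pos = 0;

    ExprSketch(String src) { this.src = src; }

    // expr : term ('+' term)* ;
    // 'env' is passed down the rule hierarchy; the int is the synthesized result.
    int expr(Map<String, Integer> env) {
        int sum = term(env);            // local initialization for this rule
        while (peek() == '+') {
            pos++;                      // consume '+'
            sum += term(env);
        }
        return sum;
    }

    // term : atom ('*' atom)* ;
    int term(Map<String, Integer> env) {
        int product = atom(env);
        while (peek() == '*') {
            pos++;                      // consume '*'
            product *= atom(env);
        }
        return product;
    }

    // atom : DIGIT+ | LETTER+ ;   names are looked up in the inherited 'env'
    int atom(Map<String, Integer> env) {
        int start = pos;
        if (Character.isDigit(peek())) {
            while (Character.isDigit(peek())) pos++;
            return Integer.parseInt(src.substring(start, pos));
        }
        if (Character.isLetter(peek())) {
            while (Character.isLetter(peek())) pos++;
            String name = src.substring(start, pos);
            Integer value = env.get(name);
            if (value == null) throw new IllegalArgumentException("unbound name: " + name);
            return value;
        }
        throw new IllegalArgumentException("unexpected input at offset " + pos);
    }

    // Returns the current character, or '\0' at end of input.
    private char peek() { return pos < src.length() ? src.charAt(pos) : '\0'; }

    // Parses and evaluates the whole string, rejecting trailing input.
    static int eval(String text, Map<String, Integer> env) {
        ExprSketch parser = new ExprSketch(text);
        int value = parser.expr(env);
        if (parser.pos != text.length()) {
            throw new IllegalArgumentException("trailing input at offset " + parser.pos);
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(eval("1+2*3", Map.of()));        // prints 7
        System.out.println(eval("x*2+1", Map.of("x", 5)));  // prints 11
    }
}
```

  Because each rule is an ordinary method, you can read it or step
  through it in a debugger, which is the kind of readability I mean
  when comparing against table-driven output.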

  Debugging both YACC and ANTLR can be difficult.  I was reminded of
  this recently while working on an ANTLR grammar.  Because errors are
  reported as a result of how the grammar logically expands, with both
  YACC and ANTLR you get an error reported for a grammar location that
  may be far away from the place in the grammar that caused the
  problem.  This makes grammars painfully difficult to debug.  Here I
  don't find ANTLR much of an improvement over YACC.

  While on this topic: I've been meaning to write about the issue of
  debugging grammars.  So I guess that I'll use this as an
  opportunity.

  I've just finished a grammar for a query language.  It has
  expressions at a number of different levels and these expressions
  are recursive.  The grammar has been painful to debug.  The only way
  I could debug it was to start with the core expression and keep
  adding the productions above it until I found the production that
  broke the grammar.  This is very time consuming and unpleasant.

  It is very, very difficult to understand the cause of a problem in a
  complex grammar.  This understanding is equivalent to expanding out
  the grammar trees until you understand where the problem lies.  In
  practice this is very difficult to do.

  There has been some discussion here about ANTLR 3.  In considering
  the features for ANTLR 3, I would concentrate on the core of what I
  believe is the advantage that ANTLR delivers: generation of parsers.

  I wrote my own scanner for this query language.  It really was not a
  big deal.  Scanners are pretty easy to write.  ANTLR makes it easy
  to integrate a scanner with the parser.  I would not complain if the
  scanner generation capability disappeared from ANTLR.  The
  concentration on scanner performance in ANTLR 3 is, I think,
  misplaced.  There is only so much time, and I think it is a good
  idea to concentrate on the core advantage of ANTLR: parser
  generation.
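
  To give a sense of how little is involved, here is a minimal sketch
  of a hand-written scanner in Java.  It is not the scanner from the
  query language project; the token kinds and the names ScannerSketch,
  Kind, Token and scan are assumptions made up for illustration.
  Hooking something like this up to an ANTLR parser would mean
  adapting it to the token-stream interface the parser expects, which
  is not shown here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hand-written scanner for a tiny query-like language.
// Illustration only; names and token kinds are invented for this sketch.
final class ScannerSketch {
    enum Kind { IDENT, NUMBER, SYMBOL, EOF }

    static final class Token {
        final Kind kind;
        final String text;

        Token(Kind kind, String text) {
            this.kind = kind;
            this.text = text;
        }

        @Override
        public String toString() { return kind + "(" + text + ")"; }
    }

    // Splits the input into identifiers, integer literals, and
    // single-character symbols, skipping whitespace between them.
    static List<Token> scan(String src) {
        List<Token> out = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
                continue;
            }
            int start = i;
            if (Character.isLetter(c)) {
                while (i < src.length() && Character.isLetterOrDigit(src.charAt(i))) i++;
                out.add(new Token(Kind.IDENT, src.substring(start, i)));
            } else if (Character.isDigit(c)) {
                while (i < src.length() && Character.isDigit(src.charAt(i))) i++;
                out.add(new Token(Kind.NUMBER, src.substring(start, i)));
            } else {
                i++;                                  // any other character is its own token
                out.add(new Token(Kind.SYMBOL, String.valueOf(c)));
            }
        }
        out.add(new Token(Kind.EOF, ""));             // sentinel so the parser can detect the end
        return out;
    }

    public static void main(String[] args) {
        System.out.println(scan("age >= 21"));
    }
}
```

  Multi-character operators such as ">=" come out as two SYMBOL
  tokens in this sketch; a real scanner would add one more branch
  for them.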

  I will not bore y'all again with my discussion on tree generation,
  except to state that this is not a feature I use either and would
  not be sad to see less time spent on it.

  What would have saved me several days of work is better features for
  debugging the grammar.  Perhaps something that would help expand out
  the grammar so I can see where the problem lies.

  Except for the difficulty of debugging ANTLR grammars, I'm pretty
  happy with ANTLR.  There is not a lot that I can think of that I'd
  like to change.

  What would help people in general, Terence, is if you finished your
  book.  One of my colleagues, an Oracle specialist who worked with me
  on the back end of the query language, tried to make a start on the
  query language parser.  He pretty much failed.  He had no idea
  of how to proceed.  Most people who have not written parsers before
  or used parser generators are going to have a pretty difficult time
  with ANTLR.  The existing documentation does not provide much to go
  on and web pages like mine are of limited help as well (my web pages
  concentrate on a C++ parser and our query language parser is
  targeted at Java).

  Ian Kaplan

