[antlr-interest] Antlr2.0 to Antlr3.0

David Wigg wiggjd at bcs.org.uk
Fri Aug 8 12:33:24 PDT 2008


As the latest author of the C++ parser I sympathise with the frustration
expressed by Ian.

Having struggled against inadequate documentation to convert the grammar
from PCCTS to Antlr1.0, and then again to use Antlr2.0, I had hoped Antlr3.0
would be easier, especially as there was the Book, but no, it's just as
difficult.

I am still trying to lend a hand to a brave soul who is converting the
C++ grammar to Antlr3.0, so as not to waste all those past hours, but it is a
desperate business. I just don't have those spare hours any more.

David.

Message: 3
Date: Thu, 7 Aug 2008 19:41:35 -0700
From: "Ian Kaplan" <iank at bearcave.com>
Subject: [antlr-interest] ANTLR version 2.X to ANTLR version 3.X (the
       horror, the horror)
To: antlr-interest at antlr.org
Message-ID:
       <7f8924df0808071941x52edecf1mbaf44a1e0ad8b076 at mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"
 As part of a research project at the Lawrence Livermore National Lab, I
have implemented a graph database query language parser using ANTLR (my
personal motto is "Death to SPARQL").  The ANTLR 2.X parser consists of an
ANTLR lexer (scanner) and ANTLR grammar rules with semantic actions
(Java code) that build data structures as a result of the parse.  The
language is simple enough that abstract syntax trees are not really
necessary.
 I have long been a fan of ANTLR and when Terence's book came out, I bought
a copy.  I spent most of the day struggling to convert about 2,000 lines of
comments and grammar for the query language from 2.X to 3.X.  It has been a
frustrating experience.  I have pored over the book and any documentation
that I can find on antlr.org.  My initial impression was that there were not
that many differences between ANTLR 2.X and 3.X.  This does not seem to be
the case, at least for my grammar which consists of a lot of embedded Java.
 More examples with semantic actions (Java code) would be very helpful.
It took me some time, for example, to understand that the
 @init{
 }
 @after{
 }
 blocks follow each other (I do note the example on page 144 of the book).
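
 For readers with the same question, here is a minimal sketch of where the
two blocks sit in an ANTLR 3 rule: after the rule name and before the colon,
one directly after the other.  The grammar name, the rule "query" and the
ID and WS lexer rules are hypothetical and are not taken from the grammar
discussed in this message.

```antlr
grammar InitAfterDemo;

// Hypothetical rule used only to show the placement of the two blocks.
query
@init {
    // Executed before anything in the rule is matched.
    System.out.println("entering query");
}
@after {
    // Executed after the rule has matched successfully.
    System.out.println("leaving query");
}
    :   ID+ EOF
    ;

ID  :   ('a'..'z' | 'A'..'Z')+ ;
WS  :   (' ' | '\t' | '\r' | '\n')+ { $channel = HIDDEN; } ;
```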

 My 2.X code has syntax like this:
     t:TOKEN   (for example t:LPAREN)
 I then reference t.getFile(), t.getLine() and t.getColumn() in my Java
code.  I have not figured out yet how to do this in 3.X.  I'd be grateful
for any pointers.
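
 One possible pointer, offered as a sketch rather than a definitive answer:
in ANTLR 3 a token label is written with "=" instead of ":", and the labelled
token exposes attributes such as $t.line, $t.pos (character position within
the line) and $t.text.  The grammar name and the rule "group" below are
hypothetical.  This sketch shows no counterpart to getFile(); how to obtain
the file name is left open here.

```antlr
grammar TokenPosDemo;

// Hypothetical rule: label the LPAREN token and report where it was found.
group
    :   t=LPAREN ID RPAREN
        {
            // $t.line is the line number, $t.pos the position within that line.
            System.out.println("'(' at line " + $t.line + ", column " + $t.pos);
        }
    ;

LPAREN : '(' ;
RPAREN : ')' ;
ID     : ('a'..'z' | 'A'..'Z')+ ;
WS     : (' ' | '\t' | '\r' | '\n')+ { $channel = HIDDEN; } ;
```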
 My 2.X code also had grammar like
 tokens {
    ADJACENCY = "adjacency";
   PATTERN = "pattern";
 }
 These are reserved words in the query language.  I really don't like the
habit in the example code of using quoted strings like 'adjacency' in the
grammar rules.
  I actually found what seemed to be 3.X examples using the above tokens
syntax.  However, it doesn't seem to work.  The proper form seems to be:
 tokens {
     ADJACENCY : 'adjacency';
    PATTERN : 'pattern';
 }
 As noted in the 2.X to 3.X documentation, there's no built-in way to
get case insensitivity without overriding the scanner input stream.
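
 The idea behind overriding the input stream can be sketched in plain Java.
The class below is a hypothetical, stand-alone analogue and not part of the
ANTLR runtime: its lookahead method folds characters to lower case, so
literals such as 'adjacency' would match any capitalisation, while the stored
text is left untouched.  With ANTLR 3 the same trick is usually applied by
subclassing the runtime's string stream and overriding its LA method; that
wiring is assumed here and not shown.

```java
/**
 * Stand-alone sketch (hypothetical class, not part of ANTLR): a character
 * stream whose lookahead is lower-cased while the original text is kept.
 */
final class NoCaseLookahead {
    static final int EOF = -1;

    private final char[] data; // original characters, never modified
    private int p = 0;         // index of the next character to consume

    NoCaseLookahead(String input) {
        this.data = input.toCharArray();
    }

    /** Character i positions ahead (1 = next character), folded to lower case. */
    int LA(int i) {
        if (i <= 0) {
            throw new IllegalArgumentException("i must be >= 1");
        }
        int idx = p + i - 1;
        if (idx >= data.length) {
            return EOF;
        }
        return Character.toLowerCase(data[idx]);
    }

    /** Advance past one character, if any remain. */
    void consume() {
        if (p < data.length) {
            p++;
        }
    }

    /** Original, unfolded text between two indexes (both inclusive). */
    String substring(int start, int stop) {
        return new String(data, start, stop - start + 1);
    }

    /** True if the upcoming characters spell the given lower-case keyword. */
    boolean lookingAt(String lowerKeyword) {
        for (int i = 0; i < lowerKeyword.length(); i++) {
            if (LA(i + 1) != lowerKeyword.charAt(i)) {
                return false;
            }
        }
        return true;
    }
}
```

 Because only the lookahead is folded, a match against "adjacency" succeeds
for input such as "Adjacency", and substring() still returns the text exactly
as the user typed it.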
 The good news is that there's documentation, but for some reason with
ANTLR there never seems to be enough documentation to make the initial
learning curve anything but painful.
 I noticed that the person who maintains the 2.X C++ grammar is looking for
someone to take it over since they don't want to deal with the conversion to
ANTLR 3.X.  I can't say I blame them.   My grammar is a lot smaller and it's
going to be at least a two-day slog with a fair amount of frustration.
 In addition to the fact that the 2.X grammar is obsolete, I'm doing the
conversion because I am hoping that LL(*) will spare me from left-factoring
my grammar into a less clear form.  I hope that I am not disappointed.
 Ian