[antlr-interest] A postmortem of my use of antler

Tue Mar 11 15:06:17 PDT 2008

siemsen at ucar.edu wrote:
>
> Wow, I think this question is backwards.  I learned ANTLR from the book
> as I wrote a nontrivial translator.
Do you have a pointer to more info? We may just have different
definitions of "non-trivial".
> I developed it using the time-honored tradition of adding only what was
> needed.  An AST never was, so my translator never grew an AST.  Yet it
> does the job very well.  To me, evaluate-as-you-parse is the normal case,
> and the question is "what are ASTs good for, other than speeding things
> up if you need multiple passes?".
Actually, my translator also (for the most part) doesn't use ASTs, but
instead has a huge library of equivalent functionality that works on
token streams. And of course I use symbol tables.

How do you look around at non-local parts of the input? (For example, if
you change the type of a variable, how do you then manipulate all the
references to it?)
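
To make the question concrete: a translator that works on a flat token
stream needs some way to get back to every use of a name after the fact.
Here is a minimal sketch in Java, assuming tokens are plain strings and
ignoring scoping entirely. ReferenceIndex and renameAll are hypothetical
names for illustration, not part of ANTLR, and renaming stands in for
whatever manipulation a type change would really require.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: remembers the token positions at which each name occurs.
class ReferenceIndex {
    private final Map<String, List<Integer>> uses = new HashMap<>();

    // Called while parsing, each time an identifier is seen.
    void record(String name, int tokenIndex) {
        uses.computeIfAbsent(name, k -> new ArrayList<>()).add(tokenIndex);
    }

    // All recorded positions for a name (empty if the name was never seen).
    List<Integer> usesOf(String name) {
        return uses.getOrDefault(name, Collections.emptyList());
    }

    // Later pass: rewrite every recorded occurrence without re-parsing.
    // Assumption: one global scope, so every occurrence is the same variable.
    static List<String> renameAll(List<String> tokens, ReferenceIndex index,
                                  String from, String to) {
        List<String> out = new ArrayList<>(tokens);
        for (int i : index.usesOf(from)) {
            out.set(i, to);
        }
        return out;
    }
}
```

Even this toy version is a second data structure built next to the token
stream, which is the point of the question: once scopes and shadowing
matter, the index has to know about nesting as well.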

How do you get around the need to know things about the tree? For example,
Java statements generally get a newline after them, but not the statement
"int i=0;" when it's inside a "for" construct:
for (int i=0;
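
Without a tree, the usual workaround is to carry the context along by
hand. A minimal sketch in Java, assuming the parser actions call enter()
and exit() at the right moments (StatementEmitter and its Context enum are
hypothetical names for illustration):

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical emitter that decides statement separators from a context stack.
class StatementEmitter {
    enum Context { BLOCK, FOR_HEADER }

    private final Deque<Context> stack = new ArrayDeque<>();
    private final StringBuilder out = new StringBuilder();

    void enter(Context c) { stack.push(c); }
    void exit() { stack.pop(); }

    // A statement gets a newline, unless we are inside a "for" header,
    // where it is followed by a single space instead.
    void emitStatement(String text) {
        out.append(text);
        out.append(stack.peek() == Context.FOR_HEADER ? ' ' : '\n');
    }

    // Emit text that is not a statement (keywords, parentheses, and so on).
    void emitRaw(String text) { out.append(text); }

    String result() { return out.toString(); }
}
```

The stack is a hand-rolled record of where you are in the tree, so this
does not avoid needing tree information; it only stores it differently.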
>
> There are many languages (most?) that can be translated without an AST
> phase.
Can you give an example? I can't imagine doing the translation inside the
parser when converting between any modern high-level languages (C, C++,
Java, COBOL, Fortran, Lisp, etc.).
> If I don't need one, why bother?  Perhaps ASTs add some nice modularity,
> or compartmentalize semantic errors or something?
No, they're essential. The parse is just a tiny phase that gets the input
into a useful data structure. For a non-trivial translator, I'd say the
parser is a tiny, trivial part: less than 0.1% of the translator. Embedding
the other 99.9% (the translation logic) in that 0.1% (the parser code)
sounds pretty ugly.
> I'm ready to be convinced, but I want some value to compensate for the
> complexity they add to the translation process.
To convince me, convert:
char *s = "Hello, ";
printf("%s%s\n", s, "World");
...to...
System.out.println("Hello, World");
...and explain how you did it without an AST.

I think the answer would have to be:
"At the METHOD_CALL part of the grammar, see if the method is named
'printf'; if so, call processPrintfCall(), where all the work is done."

Some of the questions that arise:
* Don't you now have hundreds of methods like this one, and isn't it ugly
   to have that all inside the parser?
* How do you pass values like "s" around?
* How do you know that "s" can be eliminated (i.e. control flow analysis)?
* How do you know whether the printf() call was to the system library, as
   opposed to some application-specific method called "printf()" (i.e.
   symbol tables)?

You get the idea. If the answer is "that's just a library, I only deal
with the core language", then how do you deal with language features that
are available in one language but not the other?
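
As a rough illustration of how much a processPrintfCall() method has to
take for granted, here is a minimal Java sketch. It is not how any real
translator does it. It handles only %s, requires a trailing \n, and
assumes that something else (symbol tables plus flow analysis) has already
proven that each variable argument is a known constant, passed in here as
a hypothetical "constants" map. PrintfRewriter is likewise a made-up name.

```java
import java.util.List;
import java.util.Map;

// Hypothetical rewriter: turns one C printf call into a Java println call.
class PrintfRewriter {
    // format: the text between the quotes of the C format literal, as written
    //         in the source (so a newline escape is the two chars '\' and 'n').
    // args: source text of the remaining arguments, in order.
    // constants: variable name -> known constant string value (an assumption;
    //            producing this map is the hard part the sketch skips).
    // Returns Java source, or null when the call is outside this sketch.
    static String processPrintfCall(String format, List<String> args,
                                    Map<String, String> constants) {
        if (!format.endsWith("\\n")) return null;   // println adds the newline
        String body = format.substring(0, format.length() - 2);
        StringBuilder text = new StringBuilder();
        int next = 0;
        for (int i = 0; i < body.length(); i++) {
            char c = body.charAt(i);
            if (c != '%') { text.append(c); continue; }
            // Only "%s" is supported; any other conversion gives up.
            if (i + 1 >= body.length() || body.charAt(i + 1) != 's') return null;
            if (next >= args.size()) return null;
            String arg = args.get(next++);
            boolean literal = arg.length() >= 2
                    && arg.startsWith("\"") && arg.endsWith("\"");
            String value = literal
                    ? arg.substring(1, arg.length() - 1)  // strip the quotes
                    : constants.get(arg);                 // look up the variable
            if (value == null) return null;               // not a known constant
            text.append(value);
            i++;                                          // skip the 's'
        }
        return "System.out.println(\"" + text + "\");";
    }
}
```

Given the format text %s%s\n, the arguments s and "World", and s known to
be "Hello, ", this returns System.out.println("Hello, World");. Every
"return null" marks a case where the real work would begin, and none of it
tells you whether the declaration of s can then be deleted.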
>
> I took a compiler class at the university many years ago.  We used
> yacc/lex and the famous Aho/Ullman dragon book.  I agree with a previous
> post, lexing/parsing is *hard* to do well.  Now I'm done with the theory,
> and I just want to get the job done.  IMHO, ANTLR is the next generation
> of yacc/lex, and is a great leap forward.  Many thanks to Terence for
> encapsulating the concepts in code so well.  Again, I'm done with the
> theory, so if Terence were to proclaim "ASTs are good, you should always
> use them" I wouldn't argue, I'd just do it.
I think one should take arguments on their merits and draw on one's own
experience. Terence is probably the best in the world at what he does. But
the best car designer or mechanic in the world is not necessarily the best
driver. Most rocket scientists would not be the best astronauts.

In fact, I'd argue that being really, really smart actually *hurts* one's
ability to empathize with the average user.
>
> I disagree with the original posting - using ANTLR is far superior to
> hand-coding a parser, which I've done.  I agree that the non-book
> documentation isn't great, but the simple solution is "get the book".
> Even with the book, questions like "should I use an AST or not" make
> learning hard on newbies.  Some of us still haven't figured that one
> out :-)
Yeah, I agree. I've been looking at the Javac code: it's all hand-written,
and could be ANTLRized fairly easily. The other thing, though, is that it
may not matter much. If your translator (or Java compiler) really is
non-trivial, the parsing part is relatively stable and trivial. (Easy for
me to say, right? I just grabbed a working C parser.)
>
> -- Pete