[antlr-interest] New article on StringTemplates and Treewalkers
dev at arabink.com
Wed Jan 11 09:34:18 PST 2006
Andy Tripp wrote:
> In a fit of reverse-writer's-block last night, I wrote down
> some thoughts on AST treewalking and StringTemplate, titled
> "Why I don't Use StringTemplate for Language translation"
> The article is here: http://www.jazillian.com/stringTemplate.html
A few holes to poke in your article, which I mean in the nicest
possible way.
From your paper: "But the main rationale for separating the "view"
from the "controller" and "model" is so that we can have multiple
"views", and that we can easily change the "view" without having to
touch the "model" or the "controller". Certain applications may have
multiple "views" (ANTLR, for example, which takes a single input in
ANTLR-language, but generates Java code for Java programmers, C code for
C programmers, etc). But for other applications, such as a
"Any-dialect-of-C to Java" or "C or C++ to Java", the mapping is
many-to-one, not one-to-many."
Isn't this a false dichotomy? The same considerations apply to both
situations. If Antlr can do one-to-many (source grammar to a variety of
target languages) that is only because somebody took the trouble to
write the target generation code. It's not one-to-many, but many
one-to-ones. This is exactly what happens with a many-to-one mapping
(variety of source languages to one target language): for each source
language somebody has to take the trouble to write the transformation
code, and you again end up with many one-to-ones.
So if it is a problem for Antlr, it is the same problem for Jazillian or
any other code transformer, regardless of implementation technique.
Actually I think "MVC" is probably not the best idiom for discussing
parsing and transformation, coming as it does from the world of
graphical representation of data. (Personally I don't find it useful to
think of the result of a translation as a "view" of the source; e.g.
calling the parser code generated by Antlr a "view" of the source
grammar doesn't work for me. Nobody considers the machine code emitted
by a compiler to be a "view" of the source code.)
The real question is not separation of M, V, and C, but of the
*genericity* (adaptability, flexibility, whatever) of the "service":
given a parser generator, is its backend architecture general enough to
make it easy to write specialized emitters? Given a language
transformer (e.g. Jazillian), is its frontend architecture general
enough to make it easy to specialize it for a variety of input languages?
More specifically: how hard would it be to write an ML or Haskell
emitter for Antlr (something I'd like to see)?
How hard would it be to write an ML or Haskell front-end for Jazillian?
(I mean relative to a C frontend, not relative to a backend to Antlr,
which would no doubt be easier.)
(Note GCC is a good example of genericity both on the front and back ends.)
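To make the genericity point concrete, here is a toy sketch in Python. Everything in it is hypothetical: the `Assign` node, the two "front ends", and the two "emitters" are invented for illustration and have nothing to do with how Antlr, GCC, or any real translator is built. It only shows the shape of the idea: front ends and back ends meet at a shared intermediate form, so each new language costs one new piece rather than one per pairing.

```python
from dataclasses import dataclass


@dataclass
class Assign:
    """Hypothetical shared intermediate form: target = left op right."""
    target: str
    left: str
    op: str
    right: str


# Front ends: one per source language, each producing the shared form.
# These toy parsers only handle a single "x = a + b" style statement.
def parse_c_like(src):
    target, _, left, op, right = src.rstrip(";").split()
    return Assign(target, left, op, right)


def parse_pascal_like(src):
    target, _, left, op, right = src.split()
    return Assign(target, left, op, right)


# Back ends: one per target language, each consuming the shared form.
def emit_java(node):
    return f"{node.target} = {node.left} {node.op} {node.right};"


def emit_ml(node):
    return f"val {node.target} = {node.left} {node.op} {node.right}"


FRONT_ENDS = {"c": parse_c_like, "pascal": parse_pascal_like}
BACK_ENDS = {"java": emit_java, "ml": emit_ml}


def translate(src, source_lang, target_lang):
    """Compose any registered front end with any registered back end."""
    return BACK_ENDS[target_lang](FRONT_ENDS[source_lang](src))
```

In this sketch, adding a Haskell emitter would mean one new function and one new entry in `BACK_ENDS`, with no front end touched. How close a real tool comes to that is exactly the genericity question.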
A general observation: you contrast the Antlr (AST) approach to
"pattern-matching" in a few places (e.g. "is what you've got using
StringTemplates and AST walking better than what you'd have with some
(unspecified here) pattern-matching approach?").
But parsing *is* pattern matching, no? So it isn't clear (to me) what
exact contrast you're trying to establish.
One of the examples you give to illustrate the difficulty of AST-walking:
2. At any "printf function" node, loop through the format string and
arguments, and do lots of processing to replace them with Java using the
My understanding is that you would just write a production for the
grammar of the args of the printf function, which you could take
directly from the C grammar, augmented by info from the printf
definition in the library. The "lots of processing" must occur
regardless of implementation strategy, but in Antlr the grammar
recognition part (looping through the format string and args) is clear
Correct me if I'm wrong, but I get the impression you're thinking about
writing by hand a bunch of the AST parsing logic that Antlr generates
automatically for tree grammars, rather the way you might need to
proceed if you were using a less sophisticated parser generator
(lex/yacc, etc.). In that case, yes, it would definitely be a pain
because you might need to do it all by hand. But if I understand Antlr
correctly, it saves you the trouble by supporting tree grammar. So the
interesting contrast is not necessarily between your approach and
Antlr's, but between Antlr v. other parser generators.
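As a rough illustration of the printf point above, here is a small Python sketch that treats the format string as having its own little grammar. It is not an Antlr grammar and not anyone's actual translator: the regular expression covers only a simplified subset of printf conversions (no `*` widths, no positional arguments), and the `JAVA_TYPE` table is a hypothetical mapping that simply folds `%i` and `%u` into `%d`, which ignores unsigned semantics. The point is only that the recognition step (`tokenize`) stays separate from the "lots of processing" step (`to_java_format`).

```python
import re

# Simplified grammar for one conversion:
#   %[flags][width][.precision][length]type
# Assumption: '*' widths and positional arguments are out of scope here.
CONVERSION = re.compile(
    r"%(?P<flags>[-+ #0]*)"
    r"(?P<width>\d+)?"
    r"(?:\.(?P<precision>\d+))?"
    r"(?P<length>hh|h|ll|l|L|z|j|t)?"
    r"(?P<type>[diouxXeEfgGcsp%])"
)

# Hypothetical mapping for C conversion types rewritten in this sketch.
JAVA_TYPE = {"i": "d", "u": "d"}


def tokenize(fmt):
    """Recognition: split fmt into ('text', str) and ('conv', match) tokens."""
    tokens, pos = [], 0
    for m in CONVERSION.finditer(fmt):
        if m.start() > pos:
            tokens.append(("text", fmt[pos:m.start()]))
        tokens.append(("conv", m))
        pos = m.end()
    if pos < len(fmt):
        tokens.append(("text", fmt[pos:]))
    return tokens


def to_java_format(fmt):
    """Processing: rebuild the format string, dropping C length modifiers."""
    parts = []
    for kind, value in tokenize(fmt):
        if kind == "text":
            parts.append(value)
        elif value.group("type") == "%":
            parts.append("%%")  # literal percent sign stays as-is
        else:
            spec = "%" + value.group("flags") + (value.group("width") or "")
            if value.group("precision") is not None:
                spec += "." + value.group("precision")
            spec += JAVA_TYPE.get(value.group("type"), value.group("type"))
            parts.append(spec)
    return "".join(parts)
```

For example, `to_java_format("%5ld items, %u%% done, %s\n")` gives `"%5d items, %d%% done, %s\n"`. A real translator would also have to look at the argument types, which is where the library definition of printf comes in.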
All for now. I'm not sure I agree with your paper, but it has certainly