[antlr-interest] New article on StringTemplates and Treewalkers

Thu Jan 12 07:33:32 PST 2006

Gregg Reynolds wrote:

>
> Ok, syntactically, maybe the backend code is mixed up.  But 
> conceptually?  After all, what is the difference between many-one and 
> many one-one, rilly?

I would say that today, Jazillian is just a single one-to-one
translator. There is no real "frontend" that could be swapped out to
handle a different input language, and no "backend" that could be
extended to produce multiple output languages. I guess the only way
this is pertinent is that I'm not buying what the StringTemplate
article seems (to me) to be selling: that StringTemplate can be an
effective "backend". ANTLR may be able to use it that way, but I
don't think Jazillian could use it to, say, output C# in addition to
Java. Not without major changes to the Jazillian engine itself.

>
>>
>>>
>>> So if it is a problem for Antlr, it is the same problem for 
>>> Jazillian or any other code xformer, regardless of implementation 
>>> technique.
>>
>>
>>
>> I do agree that (and I'm not sure if this is your point or not) ANTLR 
>> and Jazillian seem like they should both be designed the same way.
>
>
> Not at all, I'm only trying to abstract in order to find the gist nut of 
> the problem.  After all, if you went to the trouble of trying antlr 
> and finding it lacking, there's something there, there.

Just to be clear: I love ANTLR for lexing and parsing, just not for
treewalking. Even for the thing treewalking is best at, pretty-printing
code, I prefer to walk the tree "by hand" rather than use a treewalker.
And (here's my whole point) I don't think treewalking is a good match
for something like a C-to-Java translator at all.
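
For readers who want a picture of what walking a tree "by hand" for
pretty-printing can look like, here is a minimal sketch. It is only an
illustration: the Node class, the Kind enum, and the print method are
all hypothetical names made up for this example, not ANTLR's AST
classes and not anything taken from Jazillian.

```java
import java.util.Arrays;
import java.util.List;

public class HandWalk {
    // Hypothetical node kinds, invented for this sketch only.
    public enum Kind { LEAF, BINOP, CALL }

    // Hypothetical minimal AST node; this is NOT ANTLR's AST class.
    public static final class Node {
        final Kind kind;
        final String text;
        final List<Node> kids;

        public Node(Kind kind, String text, Node... kids) {
            this.kind = kind;
            this.text = text;
            this.kids = Arrays.asList(kids);
        }
    }

    // Plain recursion over the tree: one case per node kind, no tree grammar.
    public static String print(Node n) {
        switch (n.kind) {
            case LEAF:
                return n.text; // identifier or literal
            case BINOP:
                return "(" + print(n.kids.get(0)) + " " + n.text + " "
                        + print(n.kids.get(1)) + ")";
            default: // CALL: name(arg, arg, ...)
                StringBuilder sb = new StringBuilder(n.text).append("(");
                for (int i = 0; i < n.kids.size(); i++) {
                    if (i > 0) sb.append(", ");
                    sb.append(print(n.kids.get(i)));
                }
                return sb.append(")").toString();
        }
    }

    public static void main(String[] args) {
        Node call = new Node(Kind.CALL, "f",
                new Node(Kind.LEAF, "b"), new Node(Kind.LEAF, "c"));
        Node sum = new Node(Kind.BINOP, "+", new Node(Kind.LEAF, "a"), call);
        Node assign = new Node(Kind.BINOP, "=", new Node(Kind.LEAF, "x"), sum);
        System.out.println(print(assign)); // prints "(x = (a + f(b, c)))"
    }
}
```

Everything lives in ordinary Java, so there is no mix of grammar
actions and host-language code to read through.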

>
>
>>
>>> Nobody considers the machine code emitted by a compiler to be a 
>>> "view" of the source code.)
>>
>>
>>
>> Ah, but they do. I do, and that's exactly what Terence is saying in 
>> the StringTemplate article...that the target Java, python, and bytecode
>> are simply three slightly different "views" of the output. I agree 
>> with that.
>>
>
> Well, you're a special case so we get to remove you from the sample.  ;)
>
> But the article was about a straightforward source to source 
> transformation - not machine code generation (Java byte code is not 
> machine code).  I wonder if you and/or Mr. Parr really think of 
> compiled code - machine code - as a "view" of the source.  Ordinarily 
> I mean - of course one can talk about it that way for special purposes.

Well, I'm the pessimist. I don't think you can even separate "the view" 
in the case of high-level languages. I'm not buying the "StringTemplate lets
you produce Java, C++, and bytecode all from one engine" theme of the 
StringTemplate article. So I obviously don't think it would work
for generating machine code either. Again, ST is great for the simple 
examples given, and so I tried to outline the real-world problems
that ST won't be able to solve. And those real-world problems would 
become huge if you tried to use ST to generate machine code.

>
>>>
>>> The real question is not separation of m v and c, but of the 
>>> *genericity* (adaptability, flexibility, whatever) of the "service": 
>>> given a parser generator, is its backend architecture general enough 
>>> to make it easy to write specialized emitters?  Given a language 
>>> transformer (e.g. Jazillian), is its frontend architecture general 
>>> enough to make it easy to specialize it for a variety of input 
>>> languages?
>>
>>
>>
>> In my case, I haven't cared too much (yet) that the frontend be able 
>> to handle multiple input languages (or that the backend be able
>> to output multiple languages for that matter). Just a single 
>> C-to-Java translator is hard enough, and I've been happy to spend 3 
>> years full time
>> thinking about all the ways to do that really well, rather than 
>> expanding my scope. Having said that, I'm now working on C++ to Java, 
>> though :)
>>
>>>
>>> More specifically:  how hard would it be to write an ML or Haskell 
>>> emitter for Antlr (something I'd like to see)?
>>
>>
>>
>> Good question, and my related question is "will StringTemplate make 
>> that any easier?".
>
>
> For the actual text generation, yes (I think); but that has nothing to 
> do with target v. source driven transformation strategies.

Right, so what I'm trying to say is getting clearer as I read more :)
I have two objections: one is the "target vs. source" (or, as I prefer,
"AST walking vs. rule-based") architecture.
The second is the ability for ST to really add value (or scale) beyond 
things like producing C++/Java/C# output for ANTLR.
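
To make the "rule-based" side of that first objection concrete, here
is a minimal sketch of source-driven translation by ordered
text-replacement rules. It is a hypothetical illustration, assuming
rules can be expressed as regex/replacement pairs; the RuleSketch
class and the two rules in it are invented for this example and are
not Jazillian's actual rules or engine.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RuleSketch {
    // Hypothetical rules (regex -> replacement), applied in insertion order.
    static final Map<String, String> RULES = new LinkedHashMap<>();
    static {
        RULES.put("\\bNULL\\b", "null");                     // C NULL -> Java null
        RULES.put("\\bprintf\\s*\\(", "System.out.printf("); // printf( -> System.out.printf(
    }

    // Apply every rule once, in order, directly to the source text.
    // No AST is built or walked here.
    public static String translate(String cSource) {
        String out = cSource;
        for (Map.Entry<String, String> rule : RULES.entrySet()) {
            out = out.replaceAll(rule.getKey(), rule.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(translate("if (p == NULL) printf(\"x\");"));
    }
}
```

Each rule reads as "this source pattern becomes that target text",
which is the sense in which the approach is driven by the source
rather than by a walk over a target-shaped tree. Pure regex rules like
these know nothing about strings, comments, or scope, which is why the
more complex rules mentioned later in this thread are written in Java
code.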

> [snip]

> Yep.  Although I daresay it depends on which language one is most 
> comfortable with.  In lisp dialects it's pretty straightforward to 
> think in terms of something more treelike.  Then again, given the 
> mainstream resistance to all those parentheses...

After a year of LISP as an undergrad, I had trouble getting out of the 
LISP mindset.
Just kept thinking "Today is the first day of the rest of my life!"

>
>>
>> Avoiding mental pictures of AST trees altogether is just a HUGE 
>> productivity boost, at least for me.
>> I'd say I'm at least twice as productive in writing rules (both 
>> simple text-replacement ones and
>> complex ones written in Java code), and probably more like 5-10x more 
>> productive
>> by largely ignoring AST structures.
>
>
> That's interesting.  Can't argue with experience.  I suggest we cadge 
> a few million bucks out of the DOD to do a study.

Problem is, we all have different experiences. I think there are 3 major 
things coming into play here.
First is intelligence. Some people are so smart that they don't see the 
ugliness of code that the rest of us do, because
the code looks straightforward to them.
Second is experience. Obviously, someone who knows ANTLR really well is 
not thrown off by a mix of ANTLR and Java code.
Third is mindset.
<rant>
People in the compiler crowd tend to enjoy playing with symbols and 
languages. They enjoy discussing
the merits of various syntax issues and enjoy learning new languages. 
But there are those of us who want to build
real-world apps and use language tools, but just aren't into debating 
whether to use a '[' or a '{' at some point
and aren't smart enough or knowledgeable enough to know how to convert 
an NFA to a DFA. We've got our
BSCS and MSCS degrees and 20 years of development experience - we're not 
newbies. It's just that it can take
some work for us to see the benefits of a tool like ST or an approach 
like treewalking.

</rant>
Andy

>
> -gregg
>