[antlr-interest] New article on StringTemplates and Treewalkers

Tue Jan 10 15:18:33 PST 2006

>
>
>Well, the ANTLR translation is the farthest possible from each input  
>node goes to a single output node.  First, I don't do any tree  
>translation at all.  I parse the grammar into an AST from which I  
>derive an NFA from which I derive DFA.  I walk the AST multiple  
>times, annotating the tree and building other structures such as a  
>symbol table, NFA, DFA, etc...  The final codegen.g tree walker  
>guides translation but is hardly a simple "see a node, spit out a  
>node" kind of thing.
>  
>

OK, yea. Do use  ST in ANTLR 2.x, or is it new in 3.0? Do you have any 
pointers to where
I can find out more about how you use ST in ANTLR (or anyone else who is 
using ST
for complex stuff)? I guess I'm reading way to much into the simple 
examples you give in
these articles.

>You'll note that the following construct could be used for a subrule  
>(a|b):
>
>if ( lookahead consistent with a ) {
>   a();
>else {
>   b();
>}
>
>Now that pattern is similar to your
>
>strcasecmp(v1, v2) --> v1.compareToIgnoreCase(v2)
>
>example.  The key difference is that I just spent probably 5,000ms  
>walking all over hell and back figuring out what "lookahead  
>consistent with a" actually is.
>  
>
I think there's a key difference here. In your code, you have a mental 
picture of the AST
structure in your head (though of course it's trivial for this example). 
when I write:
old thing with v and x placeholders --> something else with x and v 
placeholders
I never have to have a mental picture an AST. I'm strictly thinking in 
terms of "patterns" or
"token streams".

>The disconnect I think you may have with ST is that you think it's  
>doing more than just spewing text.  
>
Well, I'm against both the treewalker and the ST approach. I understand 
that the STs are
just spitting out the text. Maybe at some point in the article I blamed 
STs when I should
have blamed treewalking, but the blame remains :)

>It cannot do any processing; you  
>must collect data, do analysis, do whatever you want and then use it  
>rather than print statements to dump stuff out. :)
>
>>/ Or, "show me a node in the ANTLR input AST, and I will show you the  
>/>/ equivalent
>/>/ node in the Java-version of the ANTLR output AST (probably without  
>/>/ thoroughly
>/>/ examining the whole AST - just looking at the one node".
>/
>The structure is there, yes: subrule to what template, but the  
>details are computed from rather involved analysis and the results  
>jammed in the template.  Templates only say what the output looks  
>like, not which output templates to use nor how to fill in data  
>values.  I've built lots of language translators over the years and  
>ANTLR is much harder than any language translator I've ever been  
>involved with.
>  
>
I don't doubt that. But then again, you're much smarter and more 
knowledgable than I am,
so I'll find things difficult that you don't :) It may be that 
treewalking and STs really are
the best solution to my problems, and I just can't get past their 
complexity and syntax.

>>/ But..."show me a node in a C AST (let's say INT_NUMBER "0"), and I  
>/>/ can't tell
>/>/ you what the equivalent node is in the output Java AST without a  
>/>/ thorough
>/>/ examination of the AST, both above and below the current node."
>/
>Yep, you have to do analysis to figure out the kind of construct to  
>generate.  When you know, then you ask the appropriate template to  
>spit stuff out :)
>
>i think we are actually in more agreement than you realize...i think  
>there is simply a disconnect with how tree walkers + ST would operate.
>  
>
So then let me ask, do you really consider ANTLR to have a "treewalking 
design"? How many
node actions does ANTLR have that are fairly trivial? I'll be interested 
to see what you say
about the specific issues I bring up.

My initial design did use the treewalker approach. I worked on it for a 
few months and I
was just overwhelmed with having to keep a mental picture of the AST 
structure.
I takes me a few seconds to type:

strcasecmp(v1, v2) --> v1.compareToIgnoreCase(v2)

And it would take me many times longer if I had to think about the AST
structure for those snippets of code.

>>/ So a morse-code-to-English translator is trivial, even though the  
>/>/ two ASTs are
>/>/ completely different. But a Spanish-to-Italian translator is  
>/>/ incredibly complex,
>/>/ even though the ASTs are similar. The difference is really the  
>/>/ extent of the
>/>/ amount of work that needs to be done in examining the input AST. In  
>/>/ ANTLR, you
>/>/ rarely have to look beyond the current AST node.
>/
>Boy I wish that were true!  I've spent almost 3 hard years building  
>ANTLR v3. ;)  Simple example.
>
>a : ID ... {$ID.text} ;
>
>I must do use-def chains for all labels and token references etc...  
>so that when I see $ID I know what it means; hardly a local AST node  
>reference.  Further, I have to go back to the code generator for the  
>ID and add a label so that $ID can actually work in the action. :)
>  
>
Right, same thing here. I often need a symbol table to lookup a variable 
type, but not only
that, I may need to go look at the declaration for various things (e.g. 
is it declared static?),
and maybe even do a check on every variable reference (is this ever used 
as an int? because
I want to change its type to boolean), and maybe even change every 
variable reference.

So in all that hard work for v3, how much is treewalking really buying 
you? In fact, I wonder
if you'd even say that what most of what ANTLR is doing is treewalking?

>>/ In C-to-Java (at least to the
>/>/ extent that I've done it), you usually do.
>/
>Your work looks excellent...and I like the pattern based approach for  
>small sets of patterns and hope to implement a general mechanism  
>sometime myself!
>
>>/ Andy
>/>/
>/>/ p.s. I sure hope there's a way to update an article on antlr.org :)
>/
>The old fashioned way. ;)  Send me a new copy ;)
>
>Ter
>
Thanks for the great feedback, and thanks for taking my comments the way 
they're
intended.

Andy