[antlr-interest] New article on StringTemplates and Treewalkers
Terence Parr
parrt at cs.usfca.edu
Tue Jan 10 14:37:47 PST 2006
On Jan 10, 2006, at 2:05 PM, Andy Tripp wrote:
>> I will add initially, however, that you seem to be sweeping under
>> the rug the one big counterexample of your statement, "one
>> assumption is that the structure of the input and output language
>> ASTs are similar." I'm pretty sure that ANTLR's input / output
>> languages are vastly more different than your C->Java
>> translation. ;)
>
> Yes, good point! I guess it's not so much an issue of "is the output
> language/AST similar to the input one". It's really "can each piece
> of the output language/AST be derived from a single piece of the
> input language/AST".
Well, the ANTLR translation is the farthest possible from each input
node goes to a single output node. First, I don't do any tree
translation at all. I parse the grammar into an AST from which I
derive an NFA from which I derive DFA. I walk the AST multiple
times, annotating the tree and building other structures such as a
symbol table, NFA, DFA, etc... The final codegen.g tree walker
guides translation but is hardly a simple "see a node, spit out a
node" kind of thing.
You'll note that the following construct could be used for a subrule
(a|b):
if ( lookahead consistent with a ) {
a();
else {
b();
}
Now that pattern is similar to your
strcasecmp(v1, v2) --> v1.compareToIgnoreCase(v2)
example. The key difference is that I just spent probably 5,000ms
walking all over hell and back figuring out what "lookahead
consistent with a" actually is.
The disconnect I think you may have with ST is that you think it's
doing more than just spewing text. It cannot do any processing; you
must collect data, do analysis, do whatever you want and then use it
rather than print statements to dump stuff out. :)
> Or, "show me a node in the ANTLR input AST, and I will show you the
> equivalent
> node in the Java-version of the ANTLR output AST (probably without
> thoroughly
> examining the whole AST - just looking at the one node".
The structure is there, yes: subrule to what template, but the
details are computed from rather involved analysis and the results
jammed in the template. Templates only say what the output looks
like, not which output templates to use nor how to fill in data
values. I've built lots of language translators over the years and
ANTLR is much harder than any language translator I've ever been
involved with.
> But..."show me a node in a C AST (let's say INT_NUMBER "0"), and I
> can't tell
> you what the equivalent node is in the output Java AST without a
> thorough
> examination of the AST, both above and below the current node."
Yep, you have to do analysis to figure out the kind of construct to
generate. When you know, then you ask the appropriate template to
spit stuff out :)
i think we are actually in more agreement than you realize...i think
there is simply a disconnect with how tree walkers + ST would operate.
> So a morse-code-to-English translator is trivial, even though the
> two ASTs are
> completely different. But a Spanish-to-Italian translator is
> incredibly complex,
> even though the ASTs are similar. The difference is really the
> extent of the
> amount of work that needs to be done in examining the input AST. In
> ANTLR, you
> rarely have to look beyond the current AST node.
Boy I wish that were true! I've spent almost 3 hard years building
ANTLR v3. ;) Simple example.
a : ID ... {$ID.text} ;
I must do use-def chains for all labels and token references etc...
so that when I see $ID I know what it means; hardly a local AST node
reference. Further, I have to go back to the code generator for the
ID and add a label so that $ID can actually work in the action. :)
> In C-to-Java (at least to the
> extent that I've done it), you usually do.
Your work looks excellent...and I like the pattern based approach for
small sets of patterns and hope to implement a general mechanism
sometime myself!
> Andy
>
> p.s. I sure hope there's a way to update an article on antlr.org :)
The old fashioned way. ;) Send me a new copy ;)
Ter
More information about the antlr-interest
mailing list