[antlr-interest] tree --> tree transformation

Tue Jun 20 11:07:05 PDT 2006

On Jun 19, 2006, at 10:31 PM, Bernhard Damberger wrote:
> I was trying to make a tree grammar with Antlr 3.0 that spits out
> another modified tree.

Uh oh!  You caught me!  I have not implemented that feature yet...I'm  
not sure exactly how I want to do it.

> I just want to transform a tree from one form to another.

That's the issue... you can see that the basic mechanism borrowed  
straight from parser -> AST almost works.  My concern is how to deal  
with sharing tree nodes.  Clearly if you only want to go through and  
replace x*0 with 0 everywhere in expressions you should be able to  
reuse the rest of the tree; i.e., without dup'ing it.

I *think* the way to deal with that is that if you don't create a  
tree, return the incoming tree pointer as the result tree.
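
To make that concrete, here is a rough Java sketch of the x*0 case. It assumes a hypothetical Node class and ZeroFolder helper (neither is ANTLR's tree or adaptor API); the only point is that a rule hands back the very pointer it was given unless something underneath it changed, so only the path down to a change is rebuilt.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal tree node for illustration; not ANTLR's CommonTree.
final class Node {
    final String text;
    final List<Node> children = new ArrayList<>();

    Node(String text, Node... kids) {
        this.text = text;
        for (Node k : kids) children.add(k);
    }

    boolean isZero() { return children.isEmpty() && text.equals("0"); }
}

final class ZeroFolder {
    // Returns the incoming node itself when nothing below it changed,
    // so untouched subtrees are shared rather than duplicated.
    static Node rewrite(Node t) {
        List<Node> kids = new ArrayList<>();
        boolean changed = false;
        for (Node c : t.children) {
            Node r = rewrite(c);
            changed |= (r != c);
            kids.add(r);
        }
        if (t.text.equals("*") && kids.size() == 2) {
            // x*0 or 0*x collapses to the existing zero node
            if (kids.get(0).isZero()) return kids.get(0);
            if (kids.get(1).isZero()) return kids.get(1);
        }
        if (!changed) return t;        // no new tree: hand back the input pointer
        Node copy = new Node(t.text);  // rebuild only this node; children are reused
        copy.children.addAll(kids);
        return copy;
    }
}
```

With a tree like (+ (+ b c) (* x 0)), the root is rebuilt because its second child changed, but the (+ b c) subtree in the result is the same object as in the input.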

> Two kinds of lines of code are causing problems:
>              adaptor.setTokenBoundaries(retval.tree,retval.start,retval.stop);
>
> In this case retval.start and retval.stop are Objects, not Tokens.

Yes, that would need to be turned off when building trees from a  
tree; those values are set and should not be touched.

> And
>
>                  string_literal4=(Object)input.LT(1);
>                  match(input,118,FOLLOW_118_in_statement63);
>                  string_literal4_tree =(Object)adaptor.create(string_literal4);
>
> on the third line because string_literal4 is expected to be a token,
> but it's an Object.

We could add that method create(object) to the adaptor no problem.   
The issue is that we'd be dup'ing nodes like crazy.

> Any thoughts?

Yep!  I think tree grammars should by default just return the input  
tree.  If you have -> rewrites then a new subtree for a rule is  
created.  ^ and ^^ would not be allowed and I'd like to disallow !  
also to be consistent.  ! could be useful but most likely you would  
want to remove a node upon some condition, hence, just saying ID! is  
not that useful.  If you unconditionally knew you didn't want ID in  
the tree, you could have used ID! when creating the tree.  Actually,  
you could also reuse old subtrees too:

expr
	:	^(ADD x=expr y=expr)		// return either subtree or whole thing
			-> {isZero($x)}? $y
			-> {isZero($y)}? $x
			-> $expr
	|	^(MULT x=expr y=expr)
			-> {isZero($x)}? [
	|	ID						// same ID node returned
	;

So every rule would by default terminate with a hidden "-> $rule".
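
In hand-written Java, the ADD alternative might behave roughly like the sketch below. Expr and AddRule are made-up names for illustration, and this is only my guess at the semantics, not what the code generator would emit: the predicated rewrites are tried in order, and the last case plays the role of the hidden "-> $expr".

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical node type for illustration; ANTLR's own tree classes differ.
final class Expr {
    final String type;   // "ADD", "MULT", "INT", "ID"
    final String text;
    final List<Expr> kids;

    Expr(String type, String text, Expr... kids) {
        this.type = type;
        this.text = text;
        this.kids = Arrays.asList(kids);
    }
}

final class AddRule {
    static boolean isZero(Expr e) {
        return e.type.equals("INT") && e.text.equals("0");
    }

    // Mirrors the ADD alternative only; every other node type passes
    // through untouched in this sketch (e.g. ID: same node returned).
    static Expr expr(Expr t) {
        if (!t.type.equals("ADD")) return t;
        Expr x = expr(t.kids.get(0));
        Expr y = expr(t.kids.get(1));
        if (isZero(x)) return y;                 // -> {isZero($x)}? $y
        if (isZero(y)) return x;                 // -> {isZero($y)}? $x
        if (x == t.kids.get(0) && y == t.kids.get(1)) {
            return t;                            // -> $expr: nothing rebuilt
        }
        return new Expr("ADD", t.text, x, y);    // a child changed: rebuild this node only
    }
}
```

So 0+a comes back as the original a node, a+b comes back as the original ADD node, and (0+a)+b comes back as a fresh ADD whose children are the original a and b nodes.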

Interesting...do you concur Bernhard?  I should add to blog...

Oh, I just noticed this on the blog:

http://www.antlr.org/blog/antlr3/treegrammar.tml
> June 28, 2005
>
> Ok, last night I figured out how to handle rewrites during tree  
> parsing. Simple! Just mirror what I do for token rewrite stream but  
> for tree rewrite stream. Queue up a series of rewrites until the  
> end and then alter the tree, giving it back to the user. Some  
> people have mentioned doing this on a rule-by-rule basis for tree  
> construction (during parsing) but this will work great on a grammar  
> basis for tree rewriting too. :) I think it was Paul Lucas and  
> Loring and Stanchfield who were talking about a related thing on a  
> rule basis.

Perhaps you'd also be interested in inserting nodes anywhere in the  
tree when you see, for example, an implicit var def (x=3).  You'd  
want to insert a def for x somewhere above.  You could say

engine.insertAfter($block.getFirstChild(), ^([DECL], [TYPE,"int"], $ID))

or some such.  Hmm...needs more thinking...
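
For what it's worth, the queue-then-apply idea from the blog entry could look something like this in plain Java. TNode and EditQueue are invented for the sketch and are not part of ANTLR; the "engine" above does not exist yet, so treat this as one possible shape and nothing more. Edits are recorded while walking and the tree is only altered when apply() runs at the end.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical mutable node; names here are illustrative, not ANTLR API.
final class TNode {
    final String text;
    final List<TNode> children = new ArrayList<>();

    TNode(String text, TNode... kids) {
        this.text = text;
        for (TNode k : kids) children.add(k);
    }
}

final class EditQueue {
    private final List<Runnable> pending = new ArrayList<>();

    // Record the edit now; the tree is left alone until apply().
    void insertAfter(TNode parent, TNode anchor, TNode fresh) {
        pending.add(() -> {
            // equals() is not overridden on TNode, so this is an identity lookup
            int i = parent.children.indexOf(anchor);
            if (i < 0) throw new IllegalStateException("anchor is not a child of parent");
            parent.children.add(i + 1, fresh);
        });
    }

    // Run the queued edits in the order they were recorded, then clear the queue.
    void apply() {
        for (Runnable edit : pending) edit.run();
        pending.clear();
    }
}
```

A walker that meets x=3 inside a block would call insertAfter(block, firstChild, declNode) and carry on; the block's child list stays as it was until apply() is called after the walk.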

Ter