[antlr-interest] Issues with conditional tree rewrites

Mike Olson molson at the-olsons.net
Thu Mar 18 12:30:21 PDT 2010


Hello,

 I'm trying to rewrite an AST into a tree of my own object types (using 
TOKEN<MyClassName>) and along the way do some semantic validation.

 I was having great luck until I ran into this issue...which I think may 
be a bug.

 I have an incoming Tree that looks like this:

10(COLLECTION_LITERAL) 'Sequence'
 53(IMPLICIT_VARIABLE_REFERENCE) 'self'
 37(BOOLEAN_LITERAL) 'true'

This is the output of my own little dump routine. It says I have a token 
(of type COLLECTION_LITERAL) called 'Sequence' with 2 children: an 
IMPLICIT_VARIABLE_REFERENCE called 'self' and a BOOLEAN_LITERAL called 
'true'.

I'm using the following rewrite rule to handle this:

   | ^(c=COLLECTION_LITERAL (e=exp)* )
               -> ^(EXPRESSION<CollectionLiteralAst>[$c.token,$c.text] $e*)

My result is:

10(COLLECTION_LITERAL) 'Collection[Sequence]'
   37(BOOLEAN_LITERAL) 'Boolean'

Only the last of the original "exp"s is added to the new tree.

Because I own the class "CollectionLiteralAst" I overrode the "addChild" 
method to dump all calls to addChild, here is what I see when this 
production is matched:

"""
Adding Child: nil as class CommonTree
Post Add Count: 0
Adding Child: Boolean as class BooleanLiteralAst
Post Add Count: 1
"""

 From this, I see that addChild is called once with "nil" (which, from 
what I can tell, happens every time a subtree is created) and once for 
the second child in the original tree. It is never called for the first 
child.
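For anyone who wants to try the same logging trick, here is a minimal sketch of the idea. Node and LoggingNode are hypothetical stand-ins written for this example, not the ANTLR runtime classes, and the sketch does not model how ANTLR treats a nil child. It only shows an addChild override that records each call and the child count afterwards.

```java
import java.util.ArrayList;
import java.util.List;

public class AddChildLogDemo {

    // Hypothetical stand-in for a tree node (NOT org.antlr.runtime.tree.CommonTree).
    static class Node {
        final String text;
        final List<Node> children = new ArrayList<>();

        Node(String text) {
            this.text = text;
        }

        void addChild(Node child) {
            children.add(child);
        }
    }

    // Subclass that records every addChild call, in the spirit of the dump above.
    static class LoggingNode extends Node {
        final List<String> log = new ArrayList<>();

        LoggingNode(String text) {
            super(text);
        }

        @Override
        void addChild(Node child) {
            log.add("Adding Child: " + child.text
                    + " as class " + child.getClass().getSimpleName());
            super.addChild(child);
            log.add("Post Add Count: " + children.size());
        }
    }

    public static void main(String[] args) {
        LoggingNode root = new LoggingNode("Collection[Sequence]");
        root.addChild(new Node("Boolean"));
        for (String line : root.log) {
            System.out.println(line);
        }
    }
}
```

Because the log lives on the node itself, it can be inspected after the rewrite finishes instead of being read off the console.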

Just a note, when I test it with only 1 child it works fine.  When I 
test it with 3 children, only the last child is added.

I started looking into the generated code. Once I got through the 
"matching" logic, I found the rewrite logic:

"""
                       // parser\\My.g:101:20: ^( EXPRESSION[$c.token,$c.text] ( $e)* )
                       {
                       CommonTree root_1 = (CommonTree)adaptor.nil();
                       root_1 = (CommonTree)adaptor.becomeRoot(new CollectionLiteralAst(EXPRESSION, c.token, (c!=null?c.getText():null)), root_1);

                       // parser\\MyAst.g:101:77: ( $e)*
                       while ( stream_e.hasNext() ) {
                           adaptor.addChild(root_1, stream_e.nextTree());

                       }
                       stream_e.reset();

                       adaptor.addChild(root_0, root_1);
                       }
"""

I have stepped through the logic in a debugger: "stream_e.hasNext()" 
only returns true the first time, and "stream_e.nextTree()" always 
returns the last of the children.

I then looked above this in the generated code to see where 
"stream_e" comes from and found:

"""
                   RewriteRuleSubtreeStream stream_e=new RewriteRuleSubtreeStream(adaptor,"rule e",e!=null?e.tree:null);
"""

It looks like it is initialized from the value of "e.tree", so where 
did that come from? A bit farther up we see:

"""
                       do {

///I CUT OUT A BUNCH OF STUFF HERE

                           switch (alt3) {
                           case 1 :
                               _last = (CommonTree)input.LT(1);
                               pushFollow(FOLLOW_exp_in_literalExpression472);
                               e=exp();

                           default :
                               break loop3;
                           }
                       } while (true);
"""

 From this, it looks like "e" is reassigned to each (rewritten) child in 
the source tree as the loop matches my expression. Once the loop ends, 
"e" holds only the last child, which explains why stream_e always 
returns just one result.
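Here is a minimal, ANTLR-free sketch of what I think is going on. The class and method names are made up for illustration, and plain strings stand in for subtrees. singleLabel mimics the generated loop, where one variable is overwritten on each match and the stream is built from it afterwards. listLabel shows the alternative of collecting every match as the loop runs, which, as far as I understand, is what an ANTLR 3 list label (e+=exp) is meant to do, though I have not verified that against this grammar.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LabelOverwriteDemo {

    // Mimics (e=exp)*: one variable, reassigned on every loop iteration,
    // then wrapped in a "stream" only after the loop has finished.
    static List<String> singleLabel(List<String> children) {
        String e = null;
        for (String child : children) {
            e = child; // each match overwrites the previous one
        }
        List<String> stream = new ArrayList<>();
        if (e != null) {
            stream.add(e); // only the last match is left to add
        }
        return stream;
    }

    // Collects every match while the loop runs, so nothing is lost.
    static List<String> listLabel(List<String> children) {
        List<String> stream = new ArrayList<>();
        for (String child : children) {
            stream.add(child);
        }
        return stream;
    }

    public static void main(String[] args) {
        List<String> children = Arrays.asList("self", "true");
        System.out.println("single label: " + singleLabel(children));
        System.out.println("list label:   " + listLabel(children));
    }
}
```

With one child both methods return the same thing, which matches the observation that the single-child case works fine.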

It would seem to me that this is a very common operation. I googled 
around a bit but could not find a related issue.

Please let me know if there are any known workarounds, or if I am just 
way off my rocker.

Thanks
Mike

-- 
Mike Olson


