[antlr-interest] Re: another way to preserve token order in ASTs

thrutchy eric_mahurin at yahoo.com
Sat Jul 31 18:42:54 PDT 2004


Yep, already read that.  I like the method I have below because:

- changes were minimal (override one method in an AST)
- AST nodes need no additional data (but I need one extra node per
root - you could argue which is best)
- I knew how to also preserve whitespace
- it made more sense to me

I think the way I really would like to have done this would be to have
a "left" children pointer in addition to current children pointer and
sibling pointer.  With AST nodes like that, these rules:

x : M y N ;
y : A B C^ D E ;

would come out like this:

M -> C -> N
   /   \
  A->B  D->E

I didn't do the above because one line would need to be changed in
ASTFactory.java (no functional change for normal AST nodes).  I could
have subclassed and found a way to use the new class, but didn't want
to bother.  Instead, in the solution below, I'm settling with:

M -> C -> N
     |
     A->B->invalid->D->E

The invalid token type node is used to tell where the root node would
go if it wasn't rooted.  I guess I could optimize this out in the case
when the root node is the first token in the rule.

I'm beginning to get concerned with memory usage in the AST's and will
be developing my own AST classes that have only what I need in the
smallest footprint (i.e. maybe use bytes instead of chars, no
line/column, collapse hidden tokens, etc).

Eric

--- In antlr-interest at yahoogroups.com, Terence Parr <parrt at c...> wrote:
> Have you seen the article:
> 
> http://www.antlr.org/article/preserving.token.order/ 
> preserving.token.order.tml
> 
> called "Preserving Original Token Sequence In ASTs"?
> 
> It has a pretty slick way of dealing with this issue :)
> 
> Ter
> 
> 
> On Jul 31, 2004, at 3:50 PM, thrutchy wrote:
> 
> > I just coded up a very simple way to preserve token order in the AST's
> > - put in a dummy node where the root would have gone.  What you do is
> > override the AST addChild method (used while making a root) to put the
> > child at the front with a null node behind it, instead of just adding
> > it in at the end (roots don't have any children initially).  You can
> > do this to any of the BaseAST classes, but I did it to
> > CommonASTWithHiddenTokens:
> >
> > import antlr.collections.AST;
> >
> > public class CommonAST2WithHiddenTokens extends
> > CommonASTWithHiddenTokens {
> >     protected AST left = null;
> >     public void addChild(AST node) {
> >         AST rootMarker = new CommonAST(); // Token.INVALID_TYPE
> >         rootMarker.setNextSibling(this.down);
> >         if (node != null) {
> >             this.down = (BaseAST)node;
> >             while (node.getNextSibling() != null) {
> >                 node = node.getNextSibling();
> >             }
> >             node.setNextSibling((BaseAST)rootMarker);
> >         } else {
> >             this.down = (BaseAST)rootMarker;
> >         }
> >     }
> > }
> >
> >
> > Then my AST print routines looked like this:
> >
> >     public static void printHiddenBefore(AST a) {
> >         if (a==null) return;
> >         AST leftChild = a.getFirstChild();
> >         if (leftChild!=null &&  
> > leftChild.getType()!=Token.INVALID_TYPE) {
> >             a = leftChild;
> >         }
> >         java.util.List before = new ArrayList();
> >         for (
> >             CommonHiddenStreamToken t =
> > ((CommonAST2WithHiddenTokens)a).getHiddenBefore() ;
> >             t!=null ;
> >             t = t.getHiddenBefore() )
> >         {
> >             before.add(t.getText());
> >         }
> >         for (int i=before.size()-1;i>=0;--i) {
> >             System.out.print(before.get(i));
> >         }
> >     }
> >
> >     public static AST printAST(AST a) {
> >         while (a!=null) {
> >             if (a.getType()==Token.INVALID_TYPE) {
> >                 a = a.getNextSibling();
> >                 return a;
> >             }
> >             AST leftChild = a.getFirstChild();
> >             AST rightChild = printAST(leftChild);
> >             String s = a.toString();
> >             if (s!=null) System.out.print(s);
> >             for (
> >                 CommonHiddenStreamToken t =
> > ((CommonAST2WithHiddenTokens)a).getHiddenAfter() ;
> >                 t!=null ;
> >                 t = t.getHiddenAfter() )
> >             {
> >                 System.out.print(t.getText());
> >             }
> >             printAST(rightChild);
> >             a = a.getNextSibling();
> >         }
> >         return null;
> >     }
> >
> > I toyed around with a few other ideas (add a "left" AST pointer, add a
> > dummy root node for the children to the left of the root), but those
> > required changing one line the ASTFactory class (or extending it and
> > overriding stuff).  For those interested, it was line 361 in the
> > makeASTRoot method.  It would have been nice to have this:
> >
> >             currentAST.child = root.getFirstChild();
> >
> > instead of:
> >
> >             currentAST.child = currentAST.root;
> >
> > I don't believe this would have broken compatability, but it would
> > have allowed something very different to be done in AST addChild  
> > method.
> >
> > Eric
> >
> >
> >
> >
> >
> > Yahoo! Groups Links
> >
> >
> >
> >
> >
> >
> --
> CS Professor & Grad Director, University of San Francisco
> Creator, ANTLR Parser Generator, http://www.antlr.org
> Cofounder, http://www.jguru.com
> Cofounder, http://www.knowspam.net enjoy email again!
> Cofounder, http://www.peerscope.com pure link sharing



 
Yahoo! Groups Links

<*> To visit your group on the web, go to:
    http://groups.yahoo.com/group/antlr-interest/

<*> To unsubscribe from this group, send an email to:
    antlr-interest-unsubscribe at yahoogroups.com

<*> Your use of Yahoo! Groups is subject to:
    http://docs.yahoo.com/info/terms/
 



More information about the antlr-interest mailing list