[antlr-interest] blog entry to new tree stuff

Loring Craymer Loring.G.Craymer at jpl.nasa.gov
Wed Jun 8 18:40:13 PDT 2005


At 04:11 PM 6/8/2005, Terence Parr wrote:
>Howdy folks,
>
>I'm starting to think about how to build trees, do rewrites, parse
>trees, etc...

I'm about 70-90% through an implementation that incorporates the lessons 
learned from the 2.8 implementation and integrates a very clean attribute 
syntax (simplified from my attributes proposal 
--http://www.antlr.org/article/1116319813550/AttributeUnification.pdf -- so 
that attributes and tree constructors are truly first class syntactic 
elements).  The tree construction engine should be very fast (it is 
surprisingly simple) and the tree rewrite capability will be awesome, with 
both full translocation support and node relabeling supported by the 
attribute syntax.  [I should be able to release it without the hassles I 
had for 2.8--none of the work was done on JPL time nor with JPL equipment.]

I still have to put in some more thought on semantic predicates and 
integrating the StringTemplate output stuff with the attribute syntax, but 
what I have so far goes a long ways towards removing any target-language 
dependence from grammars.


>http://www.antlr.org/blog/antlr3/trees.tml

Curiously, the adaptor looks like an integrated ASTPair/ASTFactory in terms 
of construction capability, and a visitor for tree walking.  As currently 
proposed, this will be slower than ANTLR 2 both for construction and for 
tree walking, and add no new capabilities.  However, things will certainly 
change when you add in support for the rewrite syntax--either the analysis 
engine behind the code generator will be beefed up or a different 
construction approach adopted.

The one advantage of a tree implementation using vectors of children is for 
lookahead tests (you can, say, just look at the fifth element of a vector 
to see if this is the "right" alternative to walk down rather than 
navigating the first four to get to the fifth).  However, k=1 is a good 
goal to meet when designing a tree structure, so I do not see the 
shortcircuited lookahead as being a particular advantage.  Considering that 
access to a vector element is slower (due to the index computation) than 
navigating a sibling pointer, I see this as a retrograde design 
choice.  Vectors are also likely to be a nuisance for languages other than 
Java--in C, for example, one would prefer to allocate a fixed-length 
array--but I expect that you will move to an implementation that allows you 
to calculate array length before populating the array.


>More work tomorrow on it.
>
>As always, remember that this stuff will be changing a lot for a
>while...for example, I might not use this TreeAdaptor thing.

In conjunction with the vectors of children, it only adds about 100% 
overhead for tree navigation (three fetches instead of two to access 
"nextSibling" for the vector element access plus the extra fetch through 
the adaptor; could be less if Java is able to keep the index value in a 
register).  As I pointed out to Scott, the benefits of an adaptor are 
illusory:  wrapping the root node of a tree before the walk gives 
equivalent or better capability (single adaptor has to know about all tree 
node navigation structures) at minimal cost--perhaps as many as 100 wrapper 
nodes active versus the single adaptor--and that overhead is only incurred 
in the exceptional case where you wrap a tree before walking it.

--Loring


>I thank everyone for their suggestions and ideas from which I have
>derived a lot of this stuff (with some twists of my own, naturally).
>
>Thanks,
>Ter
>--
>CS Professor & Grad Director, University of San Francisco
>Creator, ANTLR Parser Generator, http://www.antlr.org
>Cofounder, http://www.jguru.com




More information about the antlr-interest mailing list