[antlr-interest] Yet another idea for a completly genericTreeParser

Loring Craymer Loring.G.Craymer at jpl.nasa.gov
Mon May 16 01:36:25 PDT 2005


The different approaches have different associated costs and tradeoffs.  At
issue are

1.)  Runtime overhead
	a.)  Performance
	b.)  Memory Usage
2.)  Implementation
3.)  Maintenance
	a.)  Who does it?
	b.)  How well will it be carried out.
4.)  Generality/flexibility
5.)  Capability

Scott's approach wins in 1b, loses in 1a--imposes a constant overhead on
ANTLR programs--ties mine in 2 and 3, and is slightly inferior to mine in 4
and 5 (monolithic implementation versus distributed functionality).

My approach--wrap the tree root with an AST adaptor class, and wrap and
child accessed through an adaptor--imposes no additional overhead to typical
ANTLR use, but has an associated cost of between 10 and 1000 wrapper objects
at runtime (probably under 50) since any object in play within the tree
walker has a wrapper.  That is, it loses in 1b, is second to Prashant's
approach for 1a, and wins (tied with Scott) in 2 and 3.  Prashant's approach
wins in 5 if the implementation supports generating trees with a hybrid
structure.

Prashant's approach wins in 1, since the added overhead is at compile time
(ANTLR invocation) and not runtime.  However, that does not mean his
approach is optimal:  ANTLR is designed for linked sibling lists, not for
child arrays.  The performance differential for sibling lists between the
adaptor approach and the compilation approach will be small.  The
compilation approach has the highest implementation cost--an adaptor will
cost 50-100 lines of code in the typical case, and more for complex tree
structures; Prashant's approach, on the other hand, will touch most
generated code to adapt between tree models (Ter listed the two common
structures; hybrid tree structures are also possible).  Also, the
implementation cost is borne by the ANTLR maintainers (Ter plus the backend
authors), not by the application developer.  Maintenance is in the hands of
the ANTLR maintainers with Prashant's approach.  That translates to a lose
in 3a, unless the ANTLR maintainer has applications depending on the
capability, and also in 3b for the same reason.  This approach probably
loses to the per-AST adaptor approach for 4, although it could break even if
significant effort were put into supporting hybrid tree structures (both
linked sibling lists and child arrays present).

The take home message?  The current level of support is near-optimal, as
surprising as that might seem.

--Loring

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Terence Parr
> Sent: Sunday, May 15, 2005 8:33 AM
> To: ANTLR Interest
> Subject: Re: [antlr-interest] Yet another idea for a completly
> genericTreeParser
> 
> 
> On May 14, 2005, at 11:34 PM, Prashant Deva wrote:
> 
> > I have a sort of real twisted idea on how to make Antlr parse any
> > kind of tree.
> >
> > So lets see,
> > Problem: Currently we cannot parse stuff like xml dom trees since
> > their ast interface differs from the AST interface used by antlr tree
> > parsers.
> >
> > Solution: All the solution suggested till now suggest to somehow adapt
> > the 2 different interfaces.
> > I have got a pretty twisted sort of solution for the whole thing.
> >
> > What if you could change antlr itself to handle exactly your tree's
> > interface?
> 
> Hi Prashant.  I like it. :)  A very interesting twist.  How general
> would it have to be I wonder to be useful.  For example, could it be
> just the method names to use?  Probably not.  I think that if we
> could agree on the available arguments to functions such as the child
> index etc... we could let the user specify templates.  This is one
> nice thing about using StringTemplate for code gen.
> 
> Loring and Monty and I discussed something like this for tokens and
> ASTs at the Oregon ANTLR cabal.  But we thought in terms of what
> properties you could have.  Perhaps we should go further and specify
> the templates used to do common operations.  For example, here's one
> kind of tree access using ST notation:
> 
> getChild(node,index) ::= "node.getChild(index)"
> 
> and here's another:
> 
> getChild(node,index) ::= "node.children[index]"
> 
> Interesting.  In this way there would be no penalty going through a
> tree adapter for example to access the other tree.
> 
> Hmm...I also like no casting...I'm getting mighty tired of casting
> when using data structures I can tell ya!
> 
> Great line of thinking!  Can anybody think of problems?
> 
> Ter



More information about the antlr-interest mailing list