[antlr-interest] TreeDL way (was: Article against TreeWalkers)

Mon Mar 13 02:33:35 PST 2006

Hi all,

My project, TreeDL (http://treedl.sf.net), is mentioned in the article 
"Manual Tree Walking Is Better Than Tree Grammars" 
by Andy Tripp. I generally agree with this statement
but can't agree with the context:
>The mapping from one AST structure to another is quite complex
>(see XSLT, TXL, or TreeDL for example), and a single BNF-like 
>grammar doesn't capture it.

TreeDL is not for AST transformations, but for AST description
(as a structure of heterogenous tree) and for generation of code
from this AST description. It generates classes for AST node types
and for tree processing: base visitors, walkers, factories etc
- all code that is defined by AST structure.

We've compared homogenous vs heterogenous trees and
ways of tree processing in this list long time ago, just after 
"Translators Should Use Tree Grammars" article was published:
http://www.antlr.org/pipermail/antlr-interest/2004-November/thread.html

Mentioned in this article advantages of tree grammars 
(over visitors and other approaches to tree processing) exist only if
tree grammars have good tool support and other approaches don't have one.
But homogenous trees processing without tool support are much more dangerous
than heterogenous trees processing!

So, why to write additional AST description?

1. It is interface to results of parsing. Not all developers
need to look at antlr parser - with predicates, error recovery,
tree construction code (btw, it is most complicated part of antlr language 
- count percent of questions about tree construction in this list)
AST description is very clear: 
http://treedl.sourceforge.net/frontend/java5/treedl/com/unitesk/java5/Java5.tdl-xref/index.html

2a. It is source of code generation. You need not write depth-first
walker by hands and update it on each AST change - just re-generate it.

2b. It allows additional checking of tree processing code to compensate
well-known disadvantages (of visitors). For example, it is possible to
define operation over tree:

operation Object compute( virtual Expression expr, SymbolTable table )
{
  case( MinusExpression expr, table ): 
  { 
    return   (Integer)compute( expr.getLeft() )
           - (Integer)compute( expr.getRight() ); 
  }

  case( ConditionalExpression expr, table ): 
  { 
    return   (Boolean)compute( expr.getConditional() )
           ?   compute( expr.getThen() )
             : compute( expr.getElse() ); 
  }
  case( NameExpression expr, table )
  {
    return table.getValue( expr.getName() );
  }
  ...
}

Tool will check that operation case is defined for each inheritor of Expression.
So, when new expression type will be added, you will get reminder that
compute() operation should be updated. Note, in contrast to visitors
you can define additional parameters and result type.

3. Generated tree can implement any tree interface you need, including
homogenous interface (for example, ANTLR Node). It allows to process
tree using various approaches.

4. Tree can be created without parser. For example, it can be model in MVC pattern.
And you can use the same text generation library (StringTemplate?) for this model.

Analogies of TreeDL in XML world are binding APIs that create
classes for given schema to represent XML documents.
I don't know is there binding API that generate something in addition to node classes.
XML parser can be generated automaticly (advantage of standard syntax)
and we need to write parser with actions for tree creation.
Luckily, it is very straightforward code.

Regards,
Alexey

-----
Alexey Demakov
TreeDL: Tree Description Language: http://treedl.sourceforge.net
RedVerst Group: http://www.unitesk.com