[antlr-interest] TreeDL (was: AST specification and processing)

Fri Oct 15 02:50:36 PDT 2004

Hi John,

Thank you for attention to my long message :)

> > How to split translator into subsystems, how to specify interfaces
> > between them?  Natural decomposition of translator is around internal
> > representation of input data, i.e. around AST: Parser checks syntax and
> > builds AST, semantics checker verifies static semantics and adds
> > additional information to AST.  After that AST can be transformed,
> > processed in other way, output code can be generated.  All of these
> > subsystems depends on AST format only, they are (theoretically)
> > independent of each other.
> 
> But they aren't, a priori, independent of each other.  Transformations,
> even while preserving the basic structure of the tree (i.e., adhering to
> the same grammar) are often order dependent.  Also, many tree
> walkings/manipulations accumulate and/or modify information stored outside
> the trees.

O'key, I mean implementations of transformations are independent. 
Pre- and post-conditions for each transformation can be formulated
in terms of tree state - possible node types, required additional information 
(for example, reference-definition links). Information stored outside tree can be considered
as attribute of tree in whole, and can be referenced, for example, from root node,
so it is not a problem.

To specify a contract for each action over tree it is enough to describe tree structure
and state of attributes. 

> > If there is separate AST specification (not in ANTLR parser), all of
> > these subsystems (including parser) can be developed independently and at
> > the same time. AST specification can be used as a contract between
> > developers of different subsystems. According to my experience, it speeds
> > up development and reduces number of errors.
> 
> Sure (modulo taking the points I made above into account).
> 
> > I propose the notation called TreeDL to describe tree-like structures,
> > open-source tool that checks consistence of tree description and
> > translates tree description to a set of classes (now in Java, C# will be
> > added in near future). The TreeDL tool also can generate HTML
> > cross-referenced version of tree description to be used as documentation.
> > Tree nodes can be decorated by dynamic attributes to store additional
> > information.  There is powerful template engine to generate code from
> > tree.  Also there is framework for rapid tool development - library for
> > error reporting, functionality extention by plugins - TreeDL tool itself
> > uses tree description in TreeDL, mentioned template engine and framework,
> > so source code can be used as an example.
> 
> So, are you saying that TreeDL is e.g., a replacement for Antlr tree
> handling?  Basically, what's the real point of TreeDL?  I.e., what doesn't
> Antlr's tree handling do that you want?

As far as I know, in ANTLR there is no way to sepatate tree definition from parser.
It is harder for developers of tree processing than if they have clear and full description of tree structure.

So, the main point of TreeDL is to provide the notation for specification of tree structure.
For example, most of our developers are not familiar with ANTLR and they can't understand
a structure of tree that ANTLR builds - it is required to read all parser definition.
But it is unnesesary, because the only thing they should know - what kinds of nodes are used,
what children and attributes are defined for each node.

Moreover, TreeDL tree can be constructed even without parser. In one of our projects 
we build TreeDL tree from GUI and use the same codegeneration library.
May be I am "tree fan" as Terence is "grammar fan" :)
As my colleague speaks: "Not because I have no other ideas, but because this is the one I like most".
(Hm, not very easy to translate to English...)

> > Additional docs and downloads are at http://treedl.sourceforge.net
> > Example tree description
> > http://treedl.sourceforge.net/treedl/treedl/com/unitesk/atp/treedl/TreeDL.tdl-xref/index.html
> > TreeDL BNF grammar
> > http://treedl.sourceforge.net/treedl/bnf/TreeDL.bnf/index.html 
> > TreeDL language description http://treedl.sourceforge.net/treedl/treedl_en.html
> > TreeDL tool description
> > http://treedl.sourceforge.net/treedl/treedl_tool_en.html
> > Template engine usage example
> > http://treedl.sourceforge.net/treedl/xref/com/unitesk/atp/treedl/JavaNodeGenerator.html
> 
> Am I reading the docs correctly... Are you really using the visitor pattern
> rather than generating a recursive descent parser?

TreeDL notation itself doesn't define what is generated from tree description. TreeDL tool provides
default implementation but it is easy to modify translation scheme, to change what should be generated or
generate additional information you need.

Historically codegeneration library use visitor pattern, but it is not required for all TreeDL users to use it also.
It is not clear for me, what task requires recursive descent parser? Do you mean tree walker?
(Plugin that generate tree walker can be written in several hours.)
TreeDL tool doesn't do ANTLR job - ANTLR is used for parsing. Actions that create TreeDL tree nodes 
should be inserted in ANTLR parser rules. Of course, this code is longer than the use of ANTLR tree building,
but easy to write, understand and maintain (You see, there are rather many questions in this list about problems
with ANTLR tree building). And tree description guarantees that correct tree will be built - node constructor checks
that all required children and attributes are provided, and additional checks of its values can be defined.

I plan to write small examples on how to write formal text processing tool using ANTLR and TreeDL,
are members of this list interested to read them?

Another tool that we've developed is caled BNF tool. It takes grammar in BNF form, check correctness 
(well, it is simple - all used rules should be defined and reachable from start rule) and generates positive/negative tests,
i.e. sentences that a priory satisfy/unsatisfy grammar. Test can be generated from any context-free grammar definition.
It is useful when source grammar should be adapted to be suitable for ANTLR - we need some checking that 
source and modified grammars are describing the same language.

Also HTML version of grammar with cross-references can be generated. This is example:
http://treedl.sourceforge.net/treedl/bnf/TreeDL.bnf/index.html

Now BNF tool is not public available, but in the future it is possible.

Thanks,
Alexey
-----
TreeDL: Tree Description Language: http://treedl.sourceforge.net
RedVerst Group: http://www.unitesk.com

Yahoo! Groups Links

<*> To visit your group on the web, go to:
    http://groups.yahoo.com/group/antlr-interest/

<*> To unsubscribe from this group, send an email to:
    antlr-interest-unsubscribe at yahoogroups.com

<*> Your use of Yahoo! Groups is subject to:
    http://docs.yahoo.com/info/terms/