[antlr-interest] CommonTree & Tree grammar versus DIY

Sat Aug 16 00:56:51 PDT 2008

At 06:33 PM 8/15/2008, Szczepan =?iso-8859-2?q?Ho=B3yszewski?= wrote:
>Andy Tripp wrote:
>
> > Right. Whoever writes the doSomething() method shown above is going to
> > have to know what the AST looks like, regardless of whether the
> > doSomething() call is embedded in a treewalker.g file or plain old code.
>
>That's exactly the problem. Whoever writes the doSomething() for a processing
>pass that is only interested in 3% of the language constructs, will have not
>only to know, but also to painstakingly express with tree rules, what the AST
>can possibly look like from the start symbol to wherever the relevant
>constructs can appear in the grammar. This essentially duplicates a large
>chunk of the parser grammar. There is no simple way to tell ANTLR to
>doSomething() e.g. for each type definition, regardless of whether it occurs
>at top level, in a class, in a function, in an anonymous code block buried
>deep in a lambda expression, or in that fifth possible place where the draft
>0.0.2 specification will allow type definitions to occur.
>
>Szczepan Holyszewski

Exactly correct.  Except ... almost all of the machinery needed to 
implement "doSomething()" as an operation on an AST already exists in Antlr!

Most any "doSomething()" can be composed from a well-defined API set 
of AST node-oriented operations, such as create, copy, insert, 
delete, and find.

Some if not most of  the needed atomic AST operations already exist 
in the org.antlr.runtime.tree.TreeWizard hierarchy (create(Token), 
dupNode(Object), dupTree(Object), index, getParent, setParent, 
getChild, setChild, addChild, replaceChildren, etc).

What is missing is a code generation step that would generate the 
bodies of the create, copy, insert, delete, and find methods specific 
to a particular AST node structure.  This is basically the same 
operation used in the generation of a tree-walker.

Tree-walker code is generated using ASTTreeParser.stg using a custom 
tree grammar to define static AST rewrites.  The implementing code 
for AST node create, copy, insert, delete, and find methods would be 
generated based on an analogous "access grammar" and corresponding 
ASTAccessParser.stg.  As with the tree grammar, the access grammar 
would identify the nodes of interest and their relevant structure.

Even with so much existing, this is not a trivial extension to 
Antlr.  Theoretically, the end result of any "doSomething()" AST 
manipulation could be done with a well-planned set of 
tree-walkers.  It is just that a random access approach, supported by 
the equivalent of findFirst, findNext, findPrev, findLast operations, 
is distinctly better than a purely top-down (only getNext allowed) 
approach for non-trivial bi-directionally context-dependent AST 
rewrite problems.

Unfortunately, does not sound like AntlrMorph addresses this 
particular problem.

Ter, do you have another graduate student looking for a project?  Or, 
if I finish my implementation, can I get Ph.D. credits? ;)

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.antlr.org/pipermail/antlr-interest/attachments/20080816/f7ccb22a/attachment.html