[antlr-interest] CommonTree & Tree grammar versus DIY

Thu Aug 21 11:07:35 PDT 2008

On Aug 20, 2008, at 7:42 PM, Gerald Rosenberg wrote:

> At 12:30 PM 8/16/2008, Terence Parr wrote:
>
>> On Aug 16, 2008, at 12:56 AM, Gerald Rosenberg wrote:
>>> Even with so much existing, this is not a trivial extension to
>>> Antlr.  Theoretically, the end result of any "doSomething()" AST
>>> manipulation could be done with a well-planned set of tree-walkers.
>>> It is just that a random access approach, supported by the
>>> equivalent of findFirst, findNext, findPrev, findLast operations, is
>>> distinctly better than a purely top-down (only getNext allowed)
>>> approach for non-trivial bi-directionally context-dependent AST
>>> rewrite problems.
>>>
>>> Unfortunately, does not sound like AntlrMorph addresses this
>>> particular problem.
>>
>> Not sure I understand the specific problem; can you rephrase?
>
> The problem is the complexity of performing ad-hoc structural  
> modifications on an existing AST.  For example, where the AST is the  
> model in a WYSIWYG editor.  Given a high-level command to make a  
> modification, you need to examine the surrounding context/portions  
> of the AST to determine if it can be made, how and precisely where  
> to make it, and what other modifications are implicitly required in  
> order to preserve the integrity of the model represented by the AST.

Yep.  I'm now in favor of manipulating the text instead and  
regenerating the AST for the altered position. Tree manipulation is  
fraught with danger

> Hand writing AST traversal/modification code is tedious and  
> fragile.  Tree grammars very nicely abstract traversal and rewrite,  
> but are oriented exclusively to top-down, phased rewrites using out- 
> of-band symbol tables to establish state.

yup.

>  And the node rewrite is essentially statically defined by the tree  
> grammar.  Very appropriate for compiler implementation.
>
> For ad-hoc AST changes, the better approach, at least conceptually,  
> is to implement a low-level structural modification API  with  
> methods to "find" a node based on parameter values, and to similarly  
> create, copy, insert and delete nodes.

Sort of like my TreeWizard that I started.

>  Eclipse, for example, has several abstraction layers to enable ad- 
> hoc Java AST modification; the lowest though is basically find,  
> create, copy, insert and delete and the higher levels implement  
> procedural logic to evaluate context and enforce model integrity.   
> Perhaps the greatest impediment to supporting other languages in  
> Eclipse is the required hand coding of these layers.

Meaning there is no generic interface? I guess that makes sense. each  
new language would have different API requirements.

> Antlr could directly generate at least the low-level API.  For  
> example, consider an AST that is the underlying data structure for  
> an HTML editor.  A grammar to generate the desired API  might be  
> specified as:
>
>        access grammar html;
>
>        start_tag : open_tag ID ^( name ^( attr )* )*
>                => find (int start_node, boolean direction, String  
> $ID.text ) returns [int node_index]
>                => find (int start_node, boolean direction, String  
> $ID.text, String name, String attr ) returns [int node_index]
>                => create (String $ID.text, String name, List attr )  
> returns [$start_tag.tree]
>                => copy (int node_index) returns [$start_tag.tree]
>                => insert (int node_index, $start_tag.tree) returns  
> [boolean status]
>                => delete (int node_index) returns [$start_tag.tree]
>        ;
>
> This is not far off from a tree grammar: tersely abstracted, but  
> still providing sufficient information to unambiguously define  
> implementation of the API.  The generated code will be no more  
> fragile than that produced from a tree grammar.  Add in  
> heterogeneous tree node support and it is a rather complete  
> solution.  Non-trivial, but complete.  The devil is in figuring out  
> the appropriate grammar syntax for defining the API productions --  
> what is shown is good for discussion, but probably not much more.

So, ANTLR's job would be to fill in those find/create/... methods? I'm  
not sure he could figure that out from the argument list. Can you  
explain a bit more?

Thanks!
Ter