[antlr-interest] CommonTree & Tree grammar versus DIY

Fri Aug 22 13:31:28 PDT 2008

At 11:07 AM 8/21/2008, Terence Parr wrote:

>Yep.  I'm now in favor of manipulating the text instead and
>regenerating the AST for the altered position. Tree manipulation is
>fraught with danger

parse text -> new AST -> ... -> scan AST -> modify text -> repeat

This avoids tree rewrites entirely.  Nice ... except where other 
tools/elements depend on the AST instance.

For example, in Eclipse, the Java AST serves as the model for the 
code outline view and as the basis for dependency analysis, 
incremental compilation, and on-the-fly error marking.  It is also 
referenced from the code assist and JavaDoc excerpt resolvers, etc.  
Incremental AST changes are propagated to listeners, and the 
dependent tools/elements figure out what they need to do to maintain 
integrity and update the display as needed.

If the AST instance is invalidated, say by a complete replacement, 
everything gets recomputed, which is non-trivial and often noticeable 
at the UI level.  Maybe one could use an AST delta-detect/delta-apply 
step rather than completely replacing the AST?  Still, that is a lot 
of overhead if you have to do it in response to every interactive 
change made to the source text.
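
A minimal sketch of what such a delta-detect/delta-apply step might 
look like, in plain Java.  Everything here (DeltaNode, AstDelta, Edit) 
is a made-up name for illustration, not an ANTLR or Eclipse class, and 
the diff is deliberately naive: it walks both trees in parallel and 
replaces a whole subtree at the first mismatch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal AST node; not an ANTLR class.
final class DeltaNode {
    String label;
    final List<DeltaNode> children = new ArrayList<>();

    DeltaNode(String label, DeltaNode... kids) {
        this.label = label;
        for (DeltaNode k : kids) children.add(k);
    }
}

final class AstDelta {
    // One edit: the subtree reached by following `path` (child indexes
    // from the root) must become `replacement`.
    static final class Edit {
        final List<Integer> path;
        final DeltaNode replacement;

        Edit(List<Integer> path, DeltaNode replacement) {
            this.path = path;
            this.replacement = replacement;
        }
    }

    // Delta-detect: compare the live tree with a freshly parsed one.
    static List<Edit> detect(DeltaNode oldTree, DeltaNode newTree) {
        List<Edit> edits = new ArrayList<>();
        detect(oldTree, newTree, new ArrayList<>(), edits);
        return edits;
    }

    private static void detect(DeltaNode a, DeltaNode b,
                               List<Integer> path, List<Edit> out) {
        if (!a.label.equals(b.label) || a.children.size() != b.children.size()) {
            out.add(new Edit(new ArrayList<>(path), b)); // whole subtree differs
            return;
        }
        for (int i = 0; i < a.children.size(); i++) {
            path.add(i);
            detect(a.children.get(i), b.children.get(i), path, out);
            path.remove(path.size() - 1); // remove by index: pop the last step
        }
    }

    // Delta-apply: mutate the live tree in place, so node instances
    // outside the changed subtrees (including the root) stay valid.
    // The replaced node adopts the new tree's children by reference.
    static void apply(DeltaNode oldTree, List<Edit> edits) {
        for (Edit e : edits) {
            DeltaNode target = oldTree;
            for (int idx : e.path) target = target.children.get(idx);
            target.label = e.replacement.label;
            target.children.clear();
            target.children.addAll(e.replacement.children);
        }
    }

    // LISP-style rendering, handy for checking results.
    static String show(DeltaNode n) {
        if (n.children.isEmpty()) return n.label;
        StringBuilder sb = new StringBuilder("(").append(n.label);
        for (DeltaNode k : n.children) sb.append(' ').append(show(k));
        return sb.append(')').toString();
    }
}
```

The point of applying edits in place is that anything holding a 
reference to an untouched node keeps a valid reference.  The detect 
pass still visits the whole tree, which is the overhead concern raised 
above.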

Even where fraught with danger, sometimes mucking around in the AST 
is the better (or least bad) thing to do.

>>For ad-hoc AST changes, the better approach, at least conceptually,
>>is to implement a low-level structural modification API  with
>>methods to "find" a node based on parameter values, and to similarly
>>create, copy, insert and delete nodes.
>
>Sort of like my TreeWizard that I started.

TreeWizard addresses the situation where you don't have or don't want 
to be bothered with a formal model definition.  The API enables the 
programmer to build up and manipulate a tree: any tree structure and 
any manipulation, including really bad ones, are allowed.
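
To make "anything goes" concrete, here is a rough sketch of a 
model-free find/create/copy/insert/delete API in plain Java.  This is 
NOT the TreeWizard API; LooseNode and LooseTreeApi are hypothetical 
names invented for this example.  Note that nothing checks whether an 
insertion makes sense.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical untyped tree node: any label may appear anywhere.
final class LooseNode {
    final String label;
    LooseNode parent;
    final List<LooseNode> children = new ArrayList<>();

    LooseNode(String label) { this.label = label; }
}

final class LooseTreeApi {
    // Create a node and attach the given children in order.
    static LooseNode create(String label, LooseNode... kids) {
        LooseNode n = new LooseNode(label);
        for (LooseNode k : kids) insert(n, n.children.size(), k);
        return n;
    }

    // Find every node with the given label, in pre-order.
    static List<LooseNode> find(LooseNode root, String label) {
        List<LooseNode> hits = new ArrayList<>();
        collect(root, label, hits);
        return hits;
    }

    private static void collect(LooseNode n, String label, List<LooseNode> hits) {
        if (n.label.equals(label)) hits.add(n);
        for (LooseNode k : n.children) collect(k, label, hits);
    }

    // Deep copy; the copy starts out detached (no parent).
    static LooseNode copy(LooseNode n) {
        LooseNode c = new LooseNode(n.label);
        for (LooseNode k : n.children) insert(c, c.children.size(), copy(k));
        return c;
    }

    // No validation at all: the label is not checked against any model.
    static void insert(LooseNode parent, int index, LooseNode child) {
        child.parent = parent;
        parent.children.add(index, child);
    }

    // Detach a node (and its subtree) from its parent, if it has one.
    static void delete(LooseNode n) {
        if (n.parent != null) {
            n.parent.children.remove(n); // identity match: equals() not overridden
            n.parent = null;
        }
    }

    // LISP-style rendering, handy for checking results.
    static String show(LooseNode n) {
        if (n.children.isEmpty()) return n.label;
        StringBuilder sb = new StringBuilder("(").append(n.label);
        for (LooseNode k : n.children) sb.append(' ').append(show(k));
        return sb.append(')').toString();
    }
}
```

With this, inserting an ELSE node directly under a "+" expression node 
goes through without complaint, which is exactly the kind of really 
bad manipulation an unconstrained API permits.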

If you do have a formal model, then you can use the ANTLR 
lexer/parser to ensure a valid tree construction.  And if you use an 
ANTLR access grammar, you can generate a programmer API that has 
built-in knowledge of that specific model structure and that includes 
protections against doing at least some "loopy" things.

So, yes, "AntlrAccess" would generate a somewhat TreeWizard-like API 
package automatically tailored to a given formal model.
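
For illustration only, the kind of protection I have in mind could 
look roughly like the hand-written Java below.  No such generator 
exists here: GuardedModel and GuardedNode are hypothetical names, and 
the three-rule model is invented for the example.  The idea is that 
the allowed parent/child combinations come from the formal model, and 
the API refuses anything else.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Stand-in for knowledge a generator would derive from the formal
// model: which child labels each parent label accepts (invented rules).
final class GuardedModel {
    static final Map<String, Set<String>> ALLOWED = new HashMap<>();
    static {
        ALLOWED.put("BLOCK", new HashSet<>(Arrays.asList("IF", "ASSIGN")));
        ALLOWED.put("IF", new HashSet<>(Arrays.asList("COND", "BLOCK")));
        ALLOWED.put("ASSIGN", new HashSet<>(Arrays.asList("ID", "EXPR")));
    }
}

final class GuardedNode {
    final String label;
    private final List<GuardedNode> children = new ArrayList<>();

    GuardedNode(String label) { this.label = label; }

    // Read-only view: callers cannot bypass add() to mutate the tree.
    List<GuardedNode> children() {
        return Collections.unmodifiableList(children);
    }

    // The only way to attach a child; rejects combinations the model
    // does not allow, leaving the tree unchanged.
    void add(GuardedNode child) {
        Set<String> ok = GuardedModel.ALLOWED.get(label);
        if (ok == null || !ok.contains(child.label)) {
            throw new IllegalArgumentException(
                label + " cannot contain " + child.label);
        }
        children.add(child);
    }
}
```

Here adding an IF under a BLOCK succeeds, while adding an ASSIGN 
directly under an IF throws.  This only catches structural mistakes of 
the parent/child kind, hence "at least some" loopy things.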