[antlr-interest] Re: summary of trip to Montreal/SableCC land

Mon Nov 15 13:01:55 PST 2004

--- In antlr-interest at yahoogroups.com, Terence Parr <parrt at c...> wrote:

> > The main reason to have separate class for each node is reliability.
> > With type checking you can be sure that every tree that can be
> > constructed corresponds to syntactically correct input program.
> 
> By "type checking", you mean grammatical structure.  Grammars are 
> particularly good at structure I think you'd agree otherwise you 
> wouldn't be using antlr for generating parsers. ;)
> 
> > After all, tree can be constructed not only by parser, but from XML
> > serialization or from another tree after some transformation. In this
> > case we will notice corrupted tree only at run-time.
> 
> You will only notice this at run-time no matter what if you have 
> actions in your grammar that constructs trees.  Only statically 
> specified trees (ala sablecc) can be checked statically.  Seriously 
> though, this is a pretty limiting constraint.  The "meaning" of the 
> input often requires different tree structures than strict syntax would 
> imply else we'd all use parse trees not ASTs.

I understood Alexey's point to be that the use of [properly designed]
heterogeneous AST nodes removes the need to [re-]check the structure of
trees during each treewalking phase, as tree parsers currently have to.
There is a cost associated with that re-checking. Of course hetero-nodes
also better support visitors for those times when they are preferred.
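
To make the point above concrete, here is a minimal sketch of
heterogeneous nodes with a visitor. All names (Expr, Num, Add,
Evaluator, Printer) are hypothetical and not part of ANTLR. In a
statically typed language such as Java or C# the compiler would enforce
the child types; in this Python sketch a check in the constructor
stands in for that, so the structure is verified once when a node is
built and the later passes do not re-check it.

```python
from dataclasses import dataclass


class Expr:
    """Base class for every expression node (hypothetical names)."""

    def accept(self, visitor):
        # Dispatch on the node's own class, e.g. Add -> visitor.visit_Add
        return getattr(visitor, "visit_" + type(self).__name__)(self)


@dataclass(frozen=True)
class Num(Expr):
    value: int

    def __post_init__(self):
        if not isinstance(self.value, int):
            raise TypeError("Num.value must be an int")


@dataclass(frozen=True)
class Add(Expr):
    left: Expr
    right: Expr

    def __post_init__(self):
        # The only structural check: done once, when the node is built.
        for child in (self.left, self.right):
            if not isinstance(child, Expr):
                raise TypeError("Add children must be Expr nodes")


class Evaluator:
    """One pass: computes the value, trusting the tree shape."""

    def visit_Num(self, node):
        return node.value

    def visit_Add(self, node):
        return node.left.accept(self) + node.right.accept(self)


class Printer:
    """Another pass over the same tree, again with no shape checks."""

    def visit_Num(self, node):
        return str(node.value)

    def visit_Add(self, node):
        return "(" + node.left.accept(self) + " + " + node.right.accept(self) + ")"


tree = Add(Num(1), Add(Num(2), Num(3)))
value = tree.accept(Evaluator())
text = tree.accept(Printer())
```

A malformed tree such as Add(Num(1), "oops") is rejected at
construction time, before any pass runs.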

> Further, and more importantly, complex translators require multiple 
> passes over a tree that usually means altering the structure.  Your 
> static checking is gone the minute you jump to actions (whether a 
> grammar or a visitor) to manipulate the tree.  And, w/o actions of 
> course you cannot translate ;)

We are usually transforming from one definite structure to another
equally definite (but perhaps different) structure at each stage.
Depending on the differences between the input and output tree
structures and how many of them we have, I can see that the
heterogeneous approach may become unwieldy at some point. I still feel
ANTLR needs to support both.

> > Btw, I begin to understand that separate tree description is closer to
> > ANTLR tree parsers than I thought before...
> 
> :)  Hooray!

Except for the fact that separate tree description supports the use of
hetero-nodes much better than ANTLR currently does.

> > But what if I need more than one pass over tree - should I repeat tree
> > grammar in each tree walker? I don't like to have same info more than in
> > one place.
> 
> Agreed.  You have identified something that uses up lots of my spare 
> "CPU" time.  One solution is to simply use a tree grammar to call 
> action methods and then you can subclass the tree parser. Now, you are 
> back to the visitor idea and don't have to repeat the tree, however, 
> this is unsatisfying as I've said.  I believe that we need a model 
> where you can cut/paste a grammar to multiple phases and then push 
> updates to all phases when the structure changes.  In reality, this is 
> called RCS (diff3).  For papers, i'll make up some fancy name ;)
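
The "tree grammar calls action methods, then subclass the tree parser"
idea quoted above might look roughly like the sketch below. It is
hand-written Python, not ANTLR output, and Node, ExprWalker, EvalPhase
and CountPhase are hypothetical names. The tree structure is written
once in walk(); each phase only overrides the action hooks.

```python
class Node:
    """Homogeneous AST node: a token type, optional text, and children."""

    def __init__(self, kind, text=None, children=()):
        self.kind = kind
        self.text = text
        self.children = list(children)


class ExprWalker:
    """Encodes the tree structure once; phases override the action hooks."""

    def walk(self, node):
        if node.kind == "NUM":
            return self.on_num(node.text)
        if node.kind == "PLUS" and len(node.children) == 2:
            left = self.walk(node.children[0])
            right = self.walk(node.children[1])
            return self.on_plus(left, right)
        # The structure is still checked on every walk, as in a tree parser.
        raise ValueError("unexpected tree shape at " + repr(node.kind))

    # Action hooks: the defaults do nothing useful on purpose.
    def on_num(self, text):
        return None

    def on_plus(self, left, right):
        return None


class EvalPhase(ExprWalker):
    """Phase 1: evaluate the expression."""

    def on_num(self, text):
        return int(text)

    def on_plus(self, left, right):
        return left + right


class CountPhase(ExprWalker):
    """Phase 2: count the nodes in the tree."""

    def on_num(self, text):
        return 1

    def on_plus(self, left, right):
        return left + right + 1


tree = Node("PLUS", children=[
    Node("NUM", "1"),
    Node("PLUS", children=[Node("NUM", "2"), Node("NUM", "3")]),
])
total = EvalPhase().walk(tree)
count = CountPhase().walk(tree)
```

The structure is no longer repeated per phase, but every action is
reduced to a method call with a fixed signature, which is the
visitor-like limitation described in the quoted text.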

This is a biggie, methinks. I've tried a tool that allows the grammar
to be kept apart from the actions, in separate files. A grammar template
file is then merged with action code from another file to form a
complete ANTLR input file. It works, but it isn't overly satisfying
when done manually ;-(
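
For illustration only, the merge step could be as small as the sketch
below. It is not the tool mentioned above: the @@name@@ placeholder
syntax, the merge() function and the sample rule are all assumptions
made up for this example, and in-memory strings stand in for the
template and action files.

```python
import re

# Hypothetical placeholder syntax: @@name@@ marks where an action goes.
PLACEHOLDER = re.compile(r"@@(\w+)@@")


def merge(template, actions):
    """Return the template with each @@name@@ replaced by actions[name]."""

    def substitute(match):
        name = match.group(1)
        if name not in actions:
            # Fail loudly rather than emit a grammar with a missing action.
            raise KeyError("no action supplied for placeholder " + name)
        return "{ " + actions[name] + " }"

    return PLACEHOLDER.sub(substitute, template)


# One shared grammar template (illustrative rule, not a checked grammar).
template = "expr : #(PLUS expr expr) @@plus@@ | NUM @@num@@ ;"

# One action set per phase; only the printing phase is shown here.
print_actions = {"plus": 'emit("+");', "num": "emit(n);"}

merged = merge(template, print_actions)
```

Each phase would supply its own action dictionary against the same
template, so a change to the tree structure is made in one place.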

I feel an editor that hides the distinction between the grammar
template file and the [multiple] action files is needed to make this
work better. ;-)

Micheal
ANTLR/C#
