[antlr-interest] Re: Anyone tried this ANTLR-inspired CC?

Tue Nov 11 11:46:50 PST 2003

--- In antlr-interest at yahoogroups.com, "Tiller, Michael (M.M.)" <mtiller at f...> wrote:
...
> Personally, there are two extremes I would like to avoid.  One is using terminals as roots in ASTs.  I don't really care for this
> approach because I find it rather strange (just my personal preference).  The other case I wouldn't like is having every production rule
> in my grammar generate a node in my AST.  As far as I'm concerned, these are both sub-optimal.
>
> In practice, ANTLR tends to fall into the former category and I guess JJTree probably falls into the latter.  Now ANTLR is obviously
> capable of avoiding this use of terminals for roots by creating imaginary nodes.  It seems to me that the issue here is how tedious it
> can be to do this (and the fact that it really becomes a problem with the current C++ framework).  I think it would be nice if ANTLR
> was capable of automatically handling this case of synthetic or imaginary nodes and I suspect that is what is at issue here.

Mike--

If you are doing little or no transformation, then parse trees are convenient because the parser grammar specifies how the input is to 
be interpreted.  If you are doing transformation, then the non-terminals in a parse tree serve to "remember" the input (parser) grammar 
and quite often impede the transformation process.  Imaginary nodes are fine if they make sense in the output grammar, but they should 
not be inserted willy-nilly; they are best avoided unless they either make sense in the output grammar or simplify 
recognition (for example, by avoiding syntactic predicates in the tree walker grammar).  Another way of saying this is that imaginary nodes should reflect 
output semantics, not input semantics.
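
To make the contrast concrete, here is a minimal Python sketch (not ANTLR code) of a parse tree that "remembers" the grammar and the smaller tree left once the rule names are dropped.  The rule names (expr, term, factor) and the tuple layout are assumptions made up for this illustration.

```python
# Parse-tree nodes are (rule_name, child, ...) tuples; leaves are token strings.
# The rule names (expr, term, factor) are hypothetical, chosen for illustration.
PARSE_TREE = (
    "expr",
    ("term", ("factor", "1")),
    "+",
    ("term", ("factor", "2"), "*", ("factor", "3")),
)


def to_ast(node):
    """Drop rule-name wrappers and hoist binary operator tokens to the root."""
    if isinstance(node, str):  # a token leaf stays as it is
        return node
    children = [to_ast(child) for child in node[1:]]  # node[0] is the rule name
    if len(children) == 1:  # chain rule: contributes no node of its own
        return children[0]
    # Left-associative fold over the shape: operand (operator operand)*
    tree = children[0]
    for i in range(1, len(children), 2):
        tree = (children[i], tree, children[i + 1])
    return tree


AST = to_ast(PARSE_TREE)
```

The result keeps only the operators and operands; nothing in it depends on how the input grammar happened to be factored into rules, which is the property that matters for later transformation passes.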

Ter's primary innovations with SORCERER (tree walkers) were 1) the idea of specifying tree construction by annotating the 
parser grammar, and 2) tree walker grammars to recognize transformation stages.  Prior to that, parse trees were the rule; if you 
look through the literature, you can see that the thinking was to "generate output from the parse tree" directly, and most of the machinery 
is awkward to use and limited in capability because it attempts to do things in one pass.

Where parse trees have value is as an output form.  You can print directly from a parse tree, and you can feed a parse tree to a compute 
engine that expects one.  At the moment, for example, I'm using ANTLR to take an input expression language and convert it into a 
parse tree for use with the Python compiler routines.
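
As a rough illustration of a tree serving as an output form, the sketch below builds a tree by hand and passes it to Python's compiler.  It assumes the modern standard-library `ast` module as a stand-in, since the post does not say which routines were used; `ast` nodes are also abstract syntax rather than a full parse tree, so this only approximates the idea.  The expression and the variable names `a` and `b` are made up.

```python
import ast

# Hypothetical output of a translator: the expression (a + b) * 2,
# built directly as tree nodes instead of as source text.
tree = ast.Expression(
    body=ast.BinOp(
        left=ast.BinOp(
            left=ast.Name(id="a", ctx=ast.Load()),
            op=ast.Add(),
            right=ast.Name(id="b", ctx=ast.Load()),
        ),
        op=ast.Mult(),
        right=ast.Constant(value=2),
    )
)

# The compiler requires line/column information on every node.
ast.fix_missing_locations(tree)

code = compile(tree, filename="<translated>", mode="eval")
result = eval(code, {"a": 3, "b": 4})
```

The tree never passes through a text form, so the translator does not have to worry about operator precedence or parenthesization when emitting its result.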

As to the "imaginary" versus parser-terminal roots:  my experience is that there is usually enough syntactic sugar in 
languages--parentheses on function calls, for example--to use an existing token as a root, possibly changing its text and token type.  
An important design criterion for tree construction during transformation is to remove extraneous elements ("noise"), not to add new 
ones, so caution is necessary in inserting imaginary nodes.
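
A minimal sketch of that idea in Python (not ANTLR's own API): the opening parenthesis of a function call is reused as the subtree root, with its text and token type changed, while the commas and closing parenthesis are simply dropped.  The `Token` and `Node` classes, the `build_call` helper, and the `CALL` token type are all hypothetical names for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    type: str
    text: str


@dataclass
class Node:
    token: Token
    children: list = field(default_factory=list)


def build_call(name_tok, lparen_tok, arg_nodes):
    """Root a call subtree at the existing '(' token instead of a new imaginary node."""
    lparen_tok.type = "CALL"  # hypothetical token type for the tree walker
    lparen_tok.text = "call"
    return Node(lparen_tok, [Node(name_tok)] + list(arg_nodes))


# Tokens for the input f(1, 2); the commas and ')' are noise and never reach the tree.
call = build_call(
    Token("ID", "f"),
    Token("LPAREN", "("),
    [Node(Token("INT", "1")), Node(Token("INT", "2"))],
)
```

One practical benefit of reusing the token is that the root keeps whatever position information the lexer attached to it, which a freshly created imaginary node would not have.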

I am inclined to support parse tree construction in 2.8 because of its value as an output form.  That way, I can automatically 
generate the parse tree grammar for an output language and use that as a target for transformation of an input language grammar.

--Loring

> --
> Mike
