[antlr-interest] Re: Translators Should Use Tree Grammars

Mon Nov 22 19:21:43 PST 2004

--- In antlr-interest at yahoogroups.com, Monty Zukowski <monty at c...> wrote:
> Oliver Zeigermann wrote:
> > On Mon, 22 Nov 2004 21:16:12 -0000, atripp54321 <atripp at c...> wrote:
> > 
> >>Only if a top-down walk of the tree is the basis of the
> >>code that you need. If your code is something else (like
> >>applying a series of pattern-matching rules), then
> >>you have to write all the code even when using a tree grammar.
> > 
> > 
> > I think this is a strong argument. When you think of natural language
> > processing (or the translation Andy does) there hardly is
> > one-after-the-other processing.

Let me try to firm up this discussion with a few points.  First of
all, tree walkers do pattern matching, but they do context-dependent
pattern matching rather than context-independent pattern matching. 
There are some problems for which context-independent pattern
transformation is the right approach; natural language processing is
not one of them.  There is the classic joke about early computerized
translation.  The Biblical line that "The spirit is willing, but the
flesh is weak" when translated from English to Russian and back came
back as "The Vodka is good, but the meat is rotten".

You often have to be pretty smart when implementing
context-independent pattern matching.  Note all of the effort that
went into devising algorithms for finding substrings in
text--nowadays, that is commonly solved with the Boyer-Moore algorithm
and not by brute force.  For pattern matching in trees, the entire
tree is walked as a matter of course:  the only real savings a
context-independent pattern matcher has over a tree walker is in the
intellectual effort of figuring out the top-down tree structure.

That said, context-independent pattern matching and transformation can
be quite useful.  BURG is very useful for doing peephole optimization
in generating code.

> I found when doing my AREV translator that although I thought of my 
> translations as being very specific -- translate AREV case statements 
> into VB if-else statements, or translate dynamic array assignments into 
> function calls -- I would simply implement them by overriding the few 
> rules I needed to after subclassing from my tree grammar superclass. 
> This let me do the usual tree grammar tricks like keeping track of the 
> context.  In one case I had to distinguish between the root of an 
> expression and sub-expressions -- this was easy in my tree grammar.

i. e:  Monty took as much advantage of context information as he could.

> This may have not been very efficient since I was walking and building 
> trees for every other rule even though I didn't need to.  The building 
> far outweighs the walking, and Loring has been addressing that issue 
> with his new tree code.
> 
> The dominant academic approach seems to be to let you define a "rule 
> soup" and the engine figures out which rules must happen before which 
> other rules.  This prolog-type approach leads to very hard to debug 
> translators.

That is: to determine which transformations should not be appled in a
given context.  The rule-based systems tend to be
computation-intensive since they work by applying transformations in
sequence, then repeating until there are no phrases left which could
be transformed (I'm ignoring some optimization here--this is the
purely brute-force version).

> I agree that pattern-matching with ANTLR grammars is not fun, because 
> you do the actual testing in actions or semantic predicates.  However, 
> this is actually implemented as a tree walker in the BaseAST or 
> CommonAST findAll and findAllPartial methods.
> 
> In the future ANTLR could have better support of pattern matching of 
> trees.  I can't remember any specifics right now, but I'd be surprised 
> if Loring hasn't already addressed it with the 2.8/3.0 tree stuff.

No, I haven't added anything special for pattern matching.  The
automated tree grammar generation lessens the need--it provides the
interactive feedback needed to deal with tree transformation as a
design problem, not just to construct trees that can be walked via a
tree grammar.  That makes ANTLR's context-dependent pattern matching
more powerful, and the context-independent findAll routines
essentially irrelevant.

--Loring

> The thing is, once I've matched a pattern I usually want to do a 
> transformation on it.  A natural way to do that is with templates --
one 
> to match, the other to transform.  Please see 
> http://www.erights.org/elang/quasi/overview.html for some examples (I 
> helped flesh out some implementation details of those but it's really 
> the work of Mark Miller).  The pattern can be written as source
language 
> code with special markers for generic expressions or whatnot.  The 
> translation patter could be written in code as well and translated into 
> a tree (assuming a parser exists) with appropriate parts copied from
the 
> original tree to the new tree.
> 
> The main point is that I still want to specify the order of my 
> translations.  There are some directions ANTLR could go toward making 
> the specification of translations easier, namely pattern matching and 
> substitution.
> 
> Wish I had more time to flesh this out, maybe tomorrow.
> 
> Monty
> http://www.codetransform.com

Yahoo! Groups Links

<*> To visit your group on the web, go to:
    http://groups.yahoo.com/group/antlr-interest/

<*> To unsubscribe from this group, send an email to:
    antlr-interest-unsubscribe at yahoogroups.com

<*> Your use of Yahoo! Groups is subject to:
    http://docs.yahoo.com/info/terms/