[antlr-interest] philosophy about translation

Tue Oct 10 21:50:54 PDT 2006

I have off and on put a fair amount of thought into rule-based pattern match-and-replace systems and how they might relate to ANTLR.  ANTLR trees with transformations are like a surgeon's scalpel:  you can locate precisely which parts of a tree need to be changed and change them in situ.  On the other hand, trees can be a pain when you really do want to replace all instances of a given pattern. Pattern-match-and-replace systems are more of a "shotgun" approach.  Every instance of a pattern is changed, and the order in which transformations are made affects the end result.  For example

A B C -> C B A
B C D -> C B D

operating on A B C D will produce either
C B A D or A C B D depending on which rule is used.  Probstang's BURG algorithm provides an approach for finding the "best" transformation from a set of possible transformations, but figuring out how to integrate BURG with ANTLR is a challenge.

As larger and larger rule sets are built, it becomes harder and harder to be sure that you are getting the results that you want--unless you can make sure that the rules do not overlap.  I think that that is one of the reasons that interactive refactoring tools are gaining wide acceptance, but batch transformation systems have had limited acceptance.

For ANTLR use, there is an additional problem:  what if you want to follow rule-based transformation with grammar-based tree manipulation/analysis?

There is one possible solution:  instead of transforming the tree generated by ANTLR, do a pattern-match-and-replace on the grammar instead.  Of course, what would really be done is is to generate an annotated grammar that would produce the same transformation on an input tree that would be achieved by the match-and-replace algorithm as well as the output grammar.  (It would actually be a bit more complicated because some of the matches would cross rule boundaries.)  Then it is at least possible to examine the grammars to see if the transformations are free of "accidents".  A useful side effect:  runtime transformation does not involve "iterate until no further transformations can be applied".

I have yet to work on a problem where multi-pass rule-based replacement was a clear win, so I have not gone further than thinking about this.

--Loring

Monty Zukowski <monty at codetransform.com> wrote: > * Ease of specification.  Rules are clearly easier than a tree walker
> when they are clean and there are few of them.  How can one beat the
> following?
>
> ADD v1 TO v2. --> v2 += v1;
>
> Though with trees, one could perhaps auto translate that text to
> trees.   Or, just use raw:
>
> ^(ADD v1 v2) -> ^(PLUS_EQUAL v2 v1)
>

Note that the E language guys came up with a nice shorthand described
tersely here:  http://www.erights.org/elang/grammar/quasi-terms.html
and http://www.erights.org/elang/grammar/quasi-xml.html.  Basically it
lets you parse something like ADD @v1:expr TO @v2:expr into a tree
that can be matched against your source tree.  Then you can write the
--> @v2 += @v1 on the other side, which is parsed into a tree and it
will substitute the AST nodes appropriately,   So basically you are
specifying in text yet working in trees.

Ter, I think we discussed this at the Cabal.  The trick is that the
parser and lexer needs to know how to just return a generic EXPR token
for @v1:expr -- and build a tree with that as a placeholder.

> Personally,  I think that a hybrid system that does most of the work
> in a tree walker but with rules, for all of the special cases that
> Andy has pointed out, would be great!
>

Yes, I agree.  This is quite an interesting discussion.

Monty

---------------------------------
Do you Yahoo!?
 Get on board. You're invited to try the new Yahoo! Mail.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.antlr.org/pipermail/antlr-interest/attachments/20061010/8697b10c/attachment-0001.html