[antlr-interest] philosophy about translation

Fri Oct 6 11:23:47 PDT 2006

On Oct 5, 2006, at 1:29 PM, Andy Tripp wrote:
> In COBOL we have statements like:
> ADD A TO B    // B += A;
> ADD A B TO C D   // C+= A + B;  D+= A + B;
> ADD A TO B GIVING C    // C = A + B;
>
> If you bifurcate at the statement level, then you have lots of  
> logic that
> says "Here is the COBOL ADD statement, and now I'll generate the  
> equivalent Java
> statement, and either replace the COBOL AST with the Java one, or just
> somehow just attach the Java AST to the COBOL AST."

Well, logic or alternatives within a rule.

> That's fine, but it just means that (almost) all your logic is  
> there, in that processing.
> The fact that it's stored in an AST at all is of little help to  
> you...you're not doing
> many AST manipulations. So the AST just becomes a convenient data  
> structure
> for storing the state between phases,

Correct.

> as opposed to a convenient data structure
> for actually performing language translation on.

  well you have to store the data in some way and recording the  
grammatical structure by encoding it in the structure the tree seems  
useful regardless of what you're doing.  Better than a linked list of  
the input tokens right?   I guess you are saying you don't really  
care about the data structure because you are doing a declarative  
rule-based translation.

   I agree that the declarative approach works best in many  
situations.  On the other hand, I am not sure that what you say in  
your article is that bad: "part of your work is being done with  
treewalking, and part is done at the end."

  one of my big problems though with the declarative approach is that  
you can easily generate an infinite loop.  I believe this is nicely  
formalized in many of the papers done by the declarative approach  
people and is the raison d'etre for Stratego that tries to let you  
specify what the order of evaluation is for applying rules.  You can  
clearly get a situation where you loop forever.  Of course, I suppose  
that you can simply look for where the loop is by tracing the rule  
applications, but anyone who is programmed in prolog says that it's  
very difficult to debug when your list of rules doesn't work.

>> I was raving about this like 7 years ago, it just totally rocks!
>> Check the archives for my posts about multiple tree grammars, or ask
>> questions if something isn't clear.
>>
>> By the last pass, I had a completely vb tree, and then I finally
>> dumped it to text.
>
> I had looked very carefully at all your stuff when I started 4  
> years ago.
> My feeling is that if you're going to do a "natural" translation -  
> that is:
>
> String hello = "hello";
> String world = "world";
> printf("%s %s\n", hello, world);
>
> ...becomes...
>
> System.out.println("Hello World");
>
> then the "walking the AST" approach doesn't come close to working.
> The two ASTs for those two code chunks
> have almost nothing in common, and doing that translation
> is 1% a "tree-manipulation" problem, and 99% a "code mapping" problem.

I think I disagree.  Somehow  you have to find these patterns whether  
you're looking at the symbol table that was previously populated or  
you're walking looking at the flow trying to figure out the variables  
are.  Just for the record I am opposed to the translation you show  
here.  That is not at all what I would expect unless it's clear that  
those variables are always constants.

> I think if tree-walking works for most of the translation work, you  
> either
> have two very similar languages, or your output code looks just  
> like your
> input code with different syntax. "I don't want 'JOBOL'", as one of my
> customers said :)

heh heh heh...yes, very interesting point.  I was very happy with my  
annotate the tree approach for mantra which yields like

     mlist c = mArrayList.of((
     new Object() {
       public mstream value(final mstream _input) {
         mint i = new mint(0);
         mlist _data = new mArrayList();
         _input.start();
         while (_input.more()) {
           mstring n = (mstring)_input.next();
           if ( (n).notEquals((new mstring("Tom"))) ) {
           mobject _result = null;
           _data.add(_result);
           }
           i = new mint(i.v+1); // huge waste
         }
         _input.stop(); // put this in a finally
         mstream _str = new mListStream(_data);
         return _str;
       }
     }
     )
     .value(((mobject)names).toStream()));

from the simple looking:

list c = names:{string n, n!="Tom"};

  naturally, this was easy because even though it's big it's still a  
one-to-one mapping, albeit with a complicated template.

I am very much warming to your ideas about the declarative approach,  
although the pure declarative approach is not something that has  
proven popular with users.  All of the large tools that do this sort  
of thing are so collocated that programmers won't use it.  Your  
specific declarative system seems very nice and I would like to see  
more of it some time.

Ter