[antlr-interest] philosophy about translation

Fri Oct 27 15:03:28 PDT 2006

> I disagree. With ANTLR treewalkers or even any other tool and not
> treewalkers when you build ASTs and then transform them to other ASTs,
> you have to be intimately familiar with the shape of those ASTs (i.e.
> the grammar for the input and output languages). I'd rather not have
> to know that.

I see no way to avoid this and still produce a good result. However, there are few languages for which familiarity with the kind of tree one language produces does not help with the tree another produces. In fact, I think the TreeParser grammar is a huge aid to being able to 'read' the tree.

> I know that the COBOL sentence:
> ADD 1 TO A GIVING B.
> ...maps to the Java statement...
> B = A + 1;

> ...and yet I have little clue as to what the COBOL or Java ASTs look like.
> So I really do want to write:
> ADD v1 TO v2 GIVING v3 --> v3 = v1 + v2; 

Taking your COBOL example, though, I think that translating one language to another is in general much more complex than this, and that the real difficulty is being intimately familiar with the languages; the tree is surely a relatively easy thing to pick up. What is the PIC of A and B, for instance? Where is the metadata about this to be stored (front end, encoded in the IR, back end)? What significance does it have for the target language? What is the behavior of the VM when you produce System.out.println("String " + A);? What happens internally with A: will I produce code that causes STR->INT->STR conversions all the time? COBOL will reject things that don't fit the PIC, and so on.
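To make the PIC point concrete, here is a minimal Java sketch. It assumes A and B are both unsigned PIC 9(3) fields and that a store into such a field simply drops the high-order digits. That is defined behavior for a numeric MOVE, but for ADD without ON SIZE ERROR the result depends on the compiler, so treat the truncation as an assumption. The class PicSketch and the helper storeIntoPic9 are hypothetical names made up for illustration; they are not part of ANTLR or of any COBOL runtime.

```java
public class PicSketch {
    // Hypothetical helper: store a value into an unsigned PIC 9(digits) field.
    // ASSUMPTION: the sign is discarded and high-order digits that do not fit
    // are dropped. Real compilers may behave differently for arithmetic.
    static int storeIntoPic9(long value, int digits) {
        long modulus = 1;
        for (int i = 0; i < digits; i++) {
            modulus *= 10;
        }
        return (int) (Math.abs(value) % modulus);
    }

    public static void main(String[] args) {
        int a = 999;                              // A declared as PIC 9(3)
        int naive = a + 1;                        // pattern rule output: B = A + 1;
        int picAware = storeIntoPic9(a + 1L, 3);  // B is also PIC 9(3)
        System.out.println("naive    B = " + naive);
        System.out.println("PIC 9(3) B = " + picAware);
    }
}
```

The two values differ as soon as the sum no longer fits the picture, and that is information the pattern rule never sees unless the PIC metadata is carried along somewhere.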

What happens with:

MOVE MOUNTAIN TO MOHAMMED;

A universal front end->IR->high level language methodology is probably not possible. 

Surely a rule-matching scheme could also produce an unforeseen sequence of events, such that ruletriggerA changes some part of the input, which fires ruletriggerB, which changes some part of the input that fires ruletriggerA again...
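A minimal Java sketch of that A-fires-B-fires-A situation, with rules modelled as plain string rewrites. Everything here (RuleCycleSketch, terminates, the two example rules) is hypothetical and only for illustration; a real rule engine would match trees, not strings. The driver applies the first rule that changes the text and remembers every state it has seen, so it only catches cycles that revisit an earlier state; a rule set that keeps growing the input would need a step limit as well.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.UnaryOperator;

public class RuleCycleSketch {
    // Applies the first rule that changes the text, one rule per step.
    // Returns true if a fixed point is reached (no rule fires),
    // false if a previously seen state comes back (the rules cycle).
    static boolean terminates(String input, List<UnaryOperator<String>> rules) {
        Set<String> seen = new HashSet<>();
        String current = input;
        while (seen.add(current)) {
            String next = current;
            for (UnaryOperator<String> rule : rules) {
                next = rule.apply(current);
                if (!next.equals(current)) {
                    break; // this rule fired
                }
            }
            if (next.equals(current)) {
                return true; // nothing fired: done
            }
            current = next;
        }
        return false; // state revisited: A and B keep undoing each other
    }

    public static void main(String[] args) {
        UnaryOperator<String> ruleA = s -> s.replace("x + 0", "x"); // simplify
        UnaryOperator<String> ruleB = s -> s.replace("x", "x + 0"); // "normalize"
        System.out.println(terminates("x + 0", List.of(ruleA)));
        System.out.println(terminates("x + 0", List.of(ruleA, ruleB)));
    }
}
```

Each rule looks reasonable on its own; it is only the combination that never settles, which is the part that is hard to see when the rule set gets large.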

It would seem that one always has a specific project: "Source code for app A1 in lang L1 translated to A1 in lang L2", or "Any App AN in L1 to L2", or "Lang L1 to Lang L2", or "LN1 to LN2; N1 # N2", and so on. I will ignore A1->A2 ;-).

The amount of support library programming in lang L2 would probably far outweigh the other issues, and I think that, assuming you can find good enough programmers (a big if, I admit), just rewriting it in L2 would be better anyway. There is probably no way to stop the new source code from looking like the input source code, so a programmer of lang L2 would say "What the bejesus is this?"

For a translation solution, then, I suspect that you just "type it in" and end up with a tool specific to the thing you want to translate, starting with tree walkers and then probably adding some manual, hard-coded passes. Of course, you could consider this rule-set approach part of that latter phase, with a more specific task at hand. I think this yields a practical solution to the task at hand, and that you could knock out 10 of these in the time taken to deal with more general solutions ;-)

Jim
