[antlr-interest] philosophy about translation

Terence Parr parrt at cs.usfca.edu
Fri Oct 6 11:39:35 PDT 2006


On Oct 6, 2006, at 7:29 AM, Andy Tripp wrote:

>
>>
>> An interesting and difficult problem..thanks for bringing this  
>> up.   I'd have to think more.  Clearly some kind of non-text data  
>> structure  is needed for this.  I guess you'd build the Java  
>> template or AST and  then add the bits as you find them while  
>> traversing the COBOL.
>
> This is the key to the difference in the two approaches. Using an  
> AST, I kept finding myself gathering bits of information from
> around the AST. For example, say we're doing C to Java and I see  
> "if (a)". We first look for the declaration of a to see whether
> it's an int or not (it may not be because our "goto removal" phase  
> already ran, and it injects booleans). Next, we look at
> all references to "a", to see if it will be possible to change all  
> of them from "int" usages to "boolean" usages. If not, just change it
> to "if (a != 0)", but if so, go ahead and change the type to  
> boolean, and make whatever changes are needed at each reference.
>
> If you try to do that sort of thing in a tree-walking way, it will  
> be a mess, I think.

   Aren't these standard operations and data structures?    Symbol  
table, use-def chains, flow analysis.  The tree walk can simply ask  
questions of these data structures.
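
To make that concrete, here is a minimal sketch of a walker action for the "if (a)" case that only asks questions of a side table. Everything in it is hypothetical and made up for illustration: the class IfConditionRewriter, the Use enum, and the idea that an earlier pass has already recorded how each reference uses the variable. It is not code from ANTLR or from either translator discussed here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: the walker queries a symbol table instead of searching the AST. */
final class IfConditionRewriter {
    /** Assumed classification of each reference, filled in by an earlier analysis pass. */
    enum Use { CONDITION, ASSIGN_ZERO_OR_ONE, ARITHMETIC }

    /** Hypothetical symbol-table entry: declared type plus the recorded uses. */
    static final class Symbol {
        String type;                              // "int" or "boolean"
        final List<Use> uses = new ArrayList<>();
        Symbol(String type) { this.type = type; }
    }

    private final Map<String, Symbol> symbols = new HashMap<>();

    void declare(String name, String type) { symbols.put(name, new Symbol(type)); }
    void recordUse(String name, Use use) { symbols.get(name).uses.add(use); }
    String typeOf(String name) { return symbols.get(name).type; }

    /** Called by the tree walk when it sees "if (name)" in the C input. */
    String rewriteIf(String name) {
        Symbol s = symbols.get(name);
        if (s.type.equals("boolean")) {
            return "if (" + name + ")";           // e.g. injected by goto removal
        }
        // An int can be retyped only if no reference uses it arithmetically.
        boolean everyUseIsBooleanLike = !s.uses.contains(Use.ARITHMETIC);
        if (everyUseIsBooleanLike) {
            s.type = "boolean";                   // later phases emit the new type
            return "if (" + name + ")";
        }
        return "if (" + name + " != 0)";          // fall back to an explicit test
    }
}
```

A real translator would also have to rewrite each reference (0/1 assignments becoming false/true, and so on); the sketch only shows the decision being driven by the table rather than by hunting around the tree.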

>> My main point is that it's ok to have multiple tree structures, L  
>> and  L', but the union and/or slow morphing of one to the other is  
>> a total  pain I've found.
>
> Yes, it's a royal pain, but if you start with the requirement that  
> you will produce "natural" code, there's no choice.

   Well, I suppose anything is possible; it's just a matter of how
convenient.  And you are saying it's inconvenient, really, not impossible.

> I think just this simple example that I brought up before actually  
> brings the problem to the surface:
>
> String hello = "hello";
> String world = "world";
> printf("%s %s\n", hello, world);
>
> ...becomes...
>
> System.out.println("Hello World");
>
> I can't see how that can be done by treewalking. By the time the  
> code is actually written to implement "printf to System.out.println",
> there will be almost no "tree-transformation" or even "tree  
> walking" logic.

   The logic is identifying that you have entered a list of
statements and you see a print statement.  The translation logic is a
simple one-to-one mapping from printf to println, just as you would do
in a rule, right?  The only problem is discovering what the expression
should be.  Either I would have done constant propagation in a previous
phase or, in your case, you ask the question in one of your declaration
rules or something.  Do you insert actions in your rules to go
check data structures?  Surely you don't write a rule that has a
context-sensitive pattern asking if there have been all possible
variable declarations before the print, right?
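
As a rough sketch of that split, assume an earlier constant-propagation phase has filled a table of known string constants; the printf action then only maps printf to println and folds whatever the table can answer. The class PrintfTranslator and its methods are hypothetical names for illustration, and the sketch handles only %s with a trailing newline, nothing else in the printf format language.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: one-to-one printf-to-println mapping plus a constants table. */
final class PrintfTranslator {
    // Assumed to be filled by an earlier constant-propagation phase.
    private final Map<String, String> constants = new HashMap<>();

    void declareConstant(String name, String value) { constants.put(name, value); }

    /** Translate printf(format, args...) where format uses only %s holes. */
    String translate(String format, List<String> args) {
        boolean newline = format.endsWith("\n");
        String body = newline ? format.substring(0, format.length() - 1) : format;
        String[] pieces = body.split("%s", -1);   // literal text between the holes
        List<String> operands = new ArrayList<>(); // operands of the Java + expression
        StringBuilder literal = new StringBuilder(pieces[0]);
        for (int i = 1; i < pieces.length; i++) {
            String arg = args.get(i - 1);
            String known = constants.get(arg);
            if (known != null) {
                literal.append(known);            // fold the constant into the literal
            } else {
                if (literal.length() > 0) operands.add(quote(literal.toString()));
                operands.add(arg);                // unknown value: keep the variable
                literal.setLength(0);
            }
            literal.append(pieces[i]);
        }
        if (literal.length() > 0 || operands.isEmpty()) {
            operands.add(quote(literal.toString()));
        }
        String method = newline ? "println" : "print";
        return "System.out." + method + "(" + String.join(" + ", operands) + ");";
    }

    /** Render text as a Java string literal, escaping backslashes and quotes. */
    private static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }
}
```

With hello and world both in the table, the printf example folds to System.out.println("hello world"); with only hello known, it degrades to System.out.println("hello " + world); so the mapping itself never needs a context-sensitive pattern.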

> As for the try/catch, all the work is in finding a good "level" to  
> insert the try/catch. For example, if we have three consecutive
> read() calls, best to put them into a single try/catch. If we need  
> to catch both FileNotFoundException and IOException
> for one statement, and just IOException for the following  
> statement, what do we do?

   How do you handle that?  I'd very much like to learn more about your
approach; I see you talking about how the tree walking won't work,
but I don't see how yours will work.  It is very interesting and I
want to learn more.

> Thanks for your patience - guess I'm a natural contrarian :)

   Always good for an excellent discussion and to shake things up...
Ter

