[antlr-interest] philosophy about translation
Terence Parr
parrt at cs.usfca.edu
Fri Oct 6 11:39:35 PDT 2006
On Oct 6, 2006, at 7:29 AM, Andy Tripp wrote:
>
>>
>> An interesting and difficult problem..thanks for bringing this
>> up. I'd have to think more. Clearly some kind of non-text data
>> structure is needed for this. I guess you'd build the Java
>> template or AST and then add the bits as you find them while
>> traversing the COBOL.
>
> This is the key to the difference in the two approaches. Using an
> AST, I kept finding myself gathering bits of information from
> around the AST. For example, say we're doing C to Java and I see
> "if (a)". We first look for the declaration of a to see whether
> it's an int or not (it may not be because our "goto removal" phase
> already ran, and it injects booleans). Next, we look at
> all references to "a", to see if it will be possible to change all
> of them from "int" usages to "boolean" usages. If not, just change it
> to "if (a != 0)", but if so, go ahead and change the type to
> boolean, and make whatever changes are needed at each reference.
>
> If you try to do that sort of thing in a tree-walking way, it will
> be a mess, I think.
Aren't these standard operations and data structures? Symbol
table, use-def chains, flow analysis. The tree walk can simply ask
questions of these data structures.
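
As a rough illustration of that point, here is a minimal sketch of a tree walk asking a symbol table about "if (a)" rather than gathering facts from the AST itself. Every name in it (ConditionRewriter, Symbol, Use, canBecomeBoolean) is hypothetical and not part of ANTLR, and it assumes an earlier pass has already recorded each reference to the variable.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names come from ANTLR.
final class ConditionRewriter {
    /** How a variable is used at one reference site (assumed classification). */
    enum Use { CONDITION, ASSIGN_ZERO_OR_ONE, ARITHMETIC }

    /** Symbol table entry: declared type plus every recorded use. */
    static final class Symbol {
        final String name;
        String type;
        final List<Use> uses = new ArrayList<>();
        Symbol(String name, String type) { this.name = name; this.type = type; }
    }

    private final Map<String, Symbol> symbols = new HashMap<>();

    /** Filled in by an earlier pass over the declarations. */
    Symbol declare(String name, String type) {
        Symbol s = new Symbol(name, type);
        symbols.put(name, s);
        return s;
    }

    /** Filled in by an earlier pass over the references. */
    void recordUse(String name, Use use) { symbols.get(name).uses.add(use); }

    /** True if every recorded use of the variable is compatible with boolean. */
    boolean canBecomeBoolean(String name) {
        Symbol s = symbols.get(name);
        if (s.type.equals("boolean")) return true;
        for (Use u : s.uses) {
            if (u == Use.ARITHMETIC) return false;
        }
        return true;
    }

    /** Called by the tree walk when it reaches "if (name)". */
    String rewriteCondition(String name) {
        Symbol s = symbols.get(name);
        if (canBecomeBoolean(name)) {
            s.type = "boolean"; // retype the declaration; references would be rewritten too
            return "if (" + name + ")";
        }
        return "if (" + name + " != 0)"; // keep the int and make the test explicit
    }
}
```

The walker itself stays small here: the decision lives in the data structure, and the rule only asks one question.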
>> My main point is that it's ok to have multiple tree structures, L
>> and L', but the union and/or slow morphing of one to the other is
>> a total pain I've found.
>
> Yes, it's a royal pain, but if you start with the requirement that
> you will produce "natural" code, there's no choice.
Well, I suppose anything is possible; it's just a matter of how
convenient. And you are saying it's inconvenient, really, not impossible.
> I think just this simple example that I brought up before actually
> brings the problem to the surface:
>
> String hello = "hello";
> String world = "world";
> printf("%s %s\n", hello, world);
>
> ...becomes...
>
> System.out.println("Hello World");
>
> I can't see how that can be done by treewalking. By the time the
> code is actually written to implement "printf to System.out.println",
> there will be almost no "tree-transformation" or even "tree
> walking" logic.
The logic is identifying that you have entered a list of
statements and that you see a print statement. The translation logic
is a simple one-to-one mapping from printf to println, just as you
would do in a rule, right? The only problem is discovering what the
expression should be. Either I would have done constant propagation
in a previous phase or, in your case, you ask that question from one
of your declaration rules or something similar. Do you insert actions
in your rules to go check data structures? Surely you don't write a
rule with a context-sensitive pattern that asks whether all possible
variable declarations have appeared before the print, right?
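
To make the two-phase idea concrete, here is a minimal sketch, assuming a previous constant-propagation phase has produced a table of string variables with known values. PrintfFolder and its methods are hypothetical names, only the %s directive is handled, and the constant values are assumed to contain no characters that need escaping in a Java literal. Anything else returns null so the caller can fall back to a plain one-to-one mapping.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: folds constant string arguments into a printf call.
final class PrintfFolder {
    // Assumed output of an earlier constant-propagation phase.
    private final Map<String, String> constants = new HashMap<>();

    void declareConstant(String name, String value) { constants.put(name, value); }

    /**
     * Translates printf(format, args...) into a Java print statement.
     * The format is the decoded C string (a real newline, not backslash-n).
     * Returns null when an argument is not a known constant or the format
     * uses a directive other than %s.
     */
    String translate(String format, List<String> args) {
        boolean newline = format.endsWith("\n");
        String body = newline ? format.substring(0, format.length() - 1) : format;
        String method = newline ? "println" : "print";
        StringBuilder text = new StringBuilder();
        int next = 0; // index of the next argument to consume
        for (int i = 0; i < body.length(); i++) {
            char c = body.charAt(i);
            if (c != '%') {
                text.append(c);
                continue;
            }
            boolean isStringDirective = i + 1 < body.length() && body.charAt(i + 1) == 's';
            if (!isStringDirective || next >= args.size()) return null;
            String value = constants.get(args.get(next++));
            if (value == null) return null; // not constant: caller falls back
            text.append(value);
            i++; // skip the 's'
        }
        return "System.out." + method + "(\"" + text + "\");";
    }
}
```

With hello and world declared as the constants "hello" and "world", translating the format "%s %s\n" yields System.out.println("hello world"); with the text folded exactly as declared.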
> As for the try/catch, all the work is in finding a good "level" to
> insert the try/catch. For example, if we have three consecutive
> read() calls, best to put them into a single try/catch. If we need
> to catch both FileNotFoundException and IOException
> for one statement, and just IOException for the following
> statement, what do we do?
How do you handle that? I would very much like to learn more about
your approach; I see you talking about how the tree walking won't
work, but I don't see how yours will work. It is very interesting and
I want to learn more.
> Thanks for your patience - guess I'm a natural contrarian :)
Always good for an excellent discussion and to shake things up...
Ter