[antlr-interest] Re: Translators Should Use Tree Grammars

atripp54321 atripp at comcast.net
Tue Nov 23 12:48:01 PST 2004



Anakreon,

Thanks for the help. Thanks to your code, I think I
finally see why I'm seeing things differently....

You basically invoke code at the start and/or end of
each visit to each tree node. For example, you have this
in js_tree.g:

  | #(WHILE <pre_while> expr statement <post_while>)

...and this in js_tree_php.act:

@pre_while : { incLabels(); }
@post_while : { decLabels(); }

Most of these chunks of code are just a few lines, with
a few a bit larger (@assign is 50 lines, for example).
About 800 lines in total of automatically-fired-by-treewalker
code.
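
To make sure I have the mechanism right, here is a rough Java sketch
of what "fire code at the start and end of a node visit" amounts to.
Node and LabelWalker are hypothetical names of mine, not ANTLR classes
or anything from js_tree.g, and I'm assuming incLabels()/decLabels()
simply maintain a nesting counter, which I haven't verified:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal AST node; NOT ANTLR's AST class.
class Node {
    final String type;
    final List<Node> children = new ArrayList<>();

    Node(String type, Node... kids) {
        this.type = type;
        for (Node k : kids) {
            children.add(k);
        }
    }
}

// Hand-written walker that runs an action before and after each WHILE node.
class LabelWalker {
    int labelDepth = 0;     // assumed effect of incLabels()/decLabels()
    int maxLabelDepth = 0;  // deepest WHILE nesting seen so far

    void walk(Node n) {
        boolean isWhile = n.type.equals("WHILE");
        if (isWhile) {
            // plays the role of @pre_while
            labelDepth++;
            maxLabelDepth = Math.max(maxLabelDepth, labelDepth);
        }
        for (Node c : n.children) {
            walk(c);
        }
        if (isWhile) {
            // plays the role of @post_while
            labelDepth--;
        }
    }
}
```

Each action only sees the node being visited plus whatever state
the walker carries along.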

Why am I seeing things differently from (most) everyone else here?
When I look at my rules and ask "how would he do that?",
the answer is almost always "he wouldn't". My translator
is not just translating the core language, but the core libraries
too. And the translations are not just simple syntactic changes,
they're high-level rewrites.

Just to pick one example, many C functions return error codes.
For example, fopen() returns 0 (a null pointer) if it can't open a file.
That needs to be replaced with exception handling in Java.
So I have a list of the functions and the values they return
on error. I check for calls to the functions, and look for
various patterns of error checking, such as:
-----------------------------
if (fopen(xxx) == 0) { // return value checked
// error code
}
else {
// non-error code
}
-----------------------------
v = fopen(xxx); // return value stored and checked later
...
if (v != 0) {
// non-error code
}
-----------------------------

And once I've found one of these patterns, I store the
"error code" and "non-error code" (both of which may be more
than just an AST, they are a sequence of statements),
and produce the corresponding try/catch block in Java.
And, if there are several statements that may throw
the same exception, we don't want this:
try {
  open();
} catch(IOException e) {
  // error code
}
try {
  read();
} catch(IOException e) {
  // error code again!
}

So, I've got to do some real work to figure out where to
put the try/catch stuff.
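
As a very simplified sketch of the grouping step only (not the actual
Jazillian code), assume each translated statement has already been
tagged with the exception it may throw. Stmt and TryCatchGrouper are
hypothetical names, and I'm assuming a single shared block of error
code, which the real patterns don't guarantee:

```java
import java.util.List;

// Hypothetical record of one translated statement.
class Stmt {
    final String code;       // Java source text of the statement
    final String exception;  // exception it may throw, or null if none

    Stmt(String code, String exception) {
        this.code = code;
        this.exception = exception;
    }
}

class TryCatchGrouper {
    // Wraps each run of adjacent statements that throw the same
    // exception in ONE try/catch instead of one per statement.
    static String emit(List<Stmt> stmts, String errorCode) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < stmts.size()) {
            Stmt s = stmts.get(i);
            if (s.exception == null) {
                // Statement cannot throw: emit it unchanged.
                out.append(s.code).append("\n");
                i++;
                continue;
            }
            out.append("try {\n");
            int j = i;
            // Extend the run while the next statement throws the same exception.
            while (j < stmts.size() && s.exception.equals(stmts.get(j).exception)) {
                out.append("  ").append(stmts.get(j).code).append("\n");
                j++;
            }
            out.append("} catch(").append(s.exception).append(" e) {\n");
            out.append("  ").append(errorCode).append("\n");
            out.append("}\n");
            i = j;
        }
        return out.toString();
    }
}
```

Given open(); and read(); both tagged IOException, this emits a single
try block containing both calls. The hard part, which the sketch skips,
is deciding when the different chunks of C error code really are
equivalent and what to do with statements in between.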

Correct me if I'm wrong, but I don't think your translator
is doing anything nearly that complex. I have many simple
syntactic rules as you do, but I also have many complex 
rules like this one.

So, now that I think about it,
maybe even this one rule involves several things that you probably
wouldn't see in your typical language-to-language translator:

* handling of library functions, not just core language
* replacing whole mechanisms/paradigms (error codes from 
  library functions being replaced by exception handling)
* complex pattern matching (e.g. checking for various comparisons
  of the return value like ==, !=, <, etc. and even checking for
  storage in a variable and then usage of that variable)

In case you think that this rule is just an exceptionally
complex one, here are a few other examples:

* structs, unions, and enums become whole Java classes, including
  constructors and changes at each reference
* memory management that is done "by hand" in C must be changed to
  use Java objects.
* I handle multiple input files, and change C file names
  to Java ones (including combining "hello.c" and "hello.h" into
  "Hello.java")
* There are different rules in Java and C for where an array
  can be initialized.
* The syntax and semantics for array declaration are different
  (In C, it's "struct person a[3];", in Java it's 
   "Person a[] = new Person[3];" plus a loop to initialize it)

Those are just some of my rules that start with "A"
(AddClassRule, AddConstructorRule, ApplicationFileRenameRule,
ArrayInitializerRule, ArrayDeclarationRule).
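
To illustrate the array declaration case from the list above, here is
roughly the Java that the C declaration "struct person a[3];" has to
become. The Person fields and the makePeople() wrapper are made up for
the example; only the declaration-plus-loop shape matters:

```java
// Hypothetical class standing in for the translated "struct person".
class Person {
    String name = "";
    int age;
}

class ArrayDeclarationExample {
    static Person[] makePeople() {
        // C's "struct person a[3];" allocates three structs in place.
        // In Java the array holds only null references at first...
        Person[] a = new Person[3];
        // ...so a loop is needed to create the objects themselves.
        for (int i = 0; i < a.length; i++) {
            a[i] = new Person();
        }
        return a;
    }
}
```

Without the loop, the first a[0].age in the translated code would throw
a NullPointerException, which is why this is more than a syntax change.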

Now I'm really starting to wonder about how much all the
language-to-language translators out there are really doing.
I know for a fact that the C-to-Java ones (other than Jazillian)
are only doing trivial syntactic changes
(see http://jazillian.com/competition.html for details).

What's the most complex translator that people
have seen? (Complex meaning functionality, not internals).

Andy





 