[antlr-interest] philosophy about translation

Andy Tripp antlr at jazillian.com
Fri Oct 6 08:49:43 PDT 2006


Jim O'Connor wrote:

>COBOL -> generic language -> JAVA. In the general case, any specific
>language -> generic language.  Generic language -> any specific
>language.

I've thought a lot about this idea of having a generic intermediate
representation. There are at least a couple of products that say they
do this. I can't see how to make it work. Say we have a generic
"add A to B" idea ("ADD A TO B" in COBOL, "B += A" in Java). If we're
going to produce "natural" Java code, we'll produce "B++" when A is 1.
Or should it be "++B"? No way to know, unless we also store that
information, so we add some flag to our generic representation. The guy
working on the generic-to-Java part demands that the guy working on the
COBOL-to-generic part store that information, even though it's
meaningless to the COBOL guy.
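
To make the flag problem concrete, here is a minimal sketch of what
such a generic "add" node might look like. Every name in it
(GenericAdd, IncrementStyle, toJava, toCobol) is hypothetical and made
up for illustration; it is not taken from any real product. The point
is only that the node ends up carrying a field that the COBOL side has
no basis for filling in.

```java
// Hypothetical generic "add" node; all names are illustrative assumptions.
final class GenericAdd {
    // Only the Java emitter cares about this; COBOL has no such notion.
    enum IncrementStyle { NONE, PREFIX, POSTFIX }

    final String target;        // B in "ADD A TO B"
    final String amount;        // A, kept as source text for simplicity
    final IncrementStyle style; // the extra flag the Java guy asked for

    GenericAdd(String target, String amount, IncrementStyle style) {
        this.target = target;
        this.amount = amount;
        this.style = style;
    }

    // The COBOL form never looks at the flag.
    String toCobol() {
        return "ADD " + amount + " TO " + target;
    }

    // The Java form uses the flag only when the amount is the literal 1.
    String toJava() {
        if (amount.equals("1")) {
            switch (style) {
                case PREFIX:  return "++" + target + ";";
                case POSTFIX: return target + "++;";
                default:      break;
            }
        }
        return target + " += " + amount + ";";
    }
}
```

Whoever builds the node from COBOL source still has to pick PREFIX,
POSTFIX or NONE, and nothing in "ADD 1 TO B" tells him which.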

Then, the next day, the generic-to-Java guy realizes that A can be any
arbitrary expression, and that the expression must be evaluated first,
before the assignment is made. He starts making the generic
representation more generic, and the COBOL guy sees him and says,
"WTF are you doing? You can only add two numbers! I'm not going to deal
with your complicated data structure, I just want to store the fact
that you're adding one number to another."

And then the next day, the problem goes the other way. The COBOL guy
realizes that the Java guy is going to need to know that B is 10 digits
before the decimal, 5 after the decimal, and a particular precision.
The Java guy eventually realizes that he can't even use "B += A", he's
got to use BigDecimal. That doesn't bother him so much as the term
"digits". "You're going to tell me how many 'digits' it is???" he
screams. "Don't you mean bytes?" No, he really does mean "digits". And
you haven't seen nothin' yet, Java guy. Wait till he tells you how to
store those digits on disk in packed decimal format.
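
As a rough illustration of why "B += A" stops being an option, here is
a sketch of a field with 10 digits before the decimal point and 5
after, built on java.math.BigDecimal. The class name FixedDecimalField
is hypothetical. Two behaviours are assumptions of this sketch, not
something stated above: extra fraction digits are truncated rather than
rounded, and a result that needs more than 10 integer digits throws an
exception. A real translator would have to match whatever the original
COBOL program actually does in those cases.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Hypothetical fixed-size decimal field, sized in digits rather than bytes.
final class FixedDecimalField {
    private final int integerDigits;   // digits before the decimal point
    private final int fractionDigits;  // digits after the decimal point
    private BigDecimal value;

    FixedDecimalField(int integerDigits, int fractionDigits) {
        this.integerDigits = integerDigits;
        this.fractionDigits = fractionDigits;
        this.value = BigDecimal.ZERO.setScale(fractionDigits);
    }

    // Rough equivalent of "ADD amount TO this-field".
    void add(BigDecimal amount) {
        // Assumption: drop extra fraction digits instead of rounding.
        BigDecimal sum = value.add(amount)
                              .setScale(fractionDigits, RoundingMode.DOWN);
        // precision() - scale() is the number of digits before the point.
        if (sum.precision() - sum.scale() > integerDigits) {
            // Assumption: treat overflow of the integer part as an error.
            throw new ArithmeticException("result does not fit in field");
        }
        value = sum;
    }

    BigDecimal get() {
        return value;
    }
}
```

Note that this only covers the arithmetic. It says nothing about the
packed decimal layout on disk, which is a separate problem.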

So in the end, the "generic representation" can't really be "generic".
The Java guy just wants to know whether a variable is an "int" or a
"long", while the COBOL guy wants to say how many digits it has before
and after the decimal. OK, I suppose you could make it generic: you
could store both, or have the "generic representation" automatically
convert between the two. But things are going very badly and you just
got started :)
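
For what the "automatically convert between the two" option might
involve, here is a small sketch that picks a Java type from a digit
count. The class and method names (NumericTypeChooser, javaTypeFor) are
hypothetical, and the rule itself is just one plausible choice: it
relies only on the fact that every 9-digit integer fits in an int and
every 18-digit integer fits in a long.

```java
// Hypothetical helper: choose a Java type name from a digit-based size.
final class NumericTypeChooser {
    static String javaTypeFor(int integerDigits, int fractionDigits) {
        if (integerDigits < 1 || fractionDigits < 0) {
            throw new IllegalArgumentException("bad digit counts");
        }
        // Anything with a fractional part goes to BigDecimal in this sketch.
        if (fractionDigits > 0) {
            return "java.math.BigDecimal";
        }
        // Integer.MAX_VALUE is 2147483647, so all 9-digit values fit.
        if (integerDigits <= 9) {
            return "int";
        }
        // Long.MAX_VALUE has 19 digits, so all 18-digit values fit.
        if (integerDigits <= 18) {
            return "long";
        }
        return "java.math.BigDecimal";
    }
}
```

The conversion does not round-trip cleanly. An int can need 10 digits
to print, yet not every 10-digit number fits in an int, so going from
"int" back to a digit count either loses range or invents range that
was never there.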

andy


