[antlr-interest] philosophy about translation

Andy Tripp antlr at jazillian.com
Thu Oct 12 07:12:55 PDT 2006


Kay Roepke wrote:

>
> On 12 Oct 2006, at 3:56, Andy Tripp wrote:
>
>> The problem is that people value program correctness over  
>> readability. Crazy as it sounds, I prefer this program:
>> System.out.println("hello, world")
>> ...over this one...
>> System.out.print("he" + 'l' + "l"); System.out.println("o");
>>
>> The first one is clean but incorrect, the second one is correct but  
>> not clean.
>> The first one will be much easier to maintain over time. Same for  
>> million-line programs...I think it's better
>> to have something that's well written than something that's  
>> completely bug-free!
>
>
> I don't see why the first one is incorrect, but ok, I buy it ;)

It's missing the semicolon :)

> One thing I cannot agree on is that a translator is allowed to
> introduce subtle bugs in my code.

This is the mindset I'm trying to change...it's at the heart of what
I'm doing.
Every translator that I know of other than Jazillian goes for
correctness, and every one (that I've seen) produces something like
50 lines of code from the one-line "hello, world" program.
I could talk all day about why it's perfectly reasonable to allow bugs
to be introduced (as is done every time a human writes or rewrites
anything), but I'll spare you :)

> I think there's a tradeoff between readability and correctness.
> Granted, I can live with border cases, where the vendor says "Listen,
> for constructs like malloc you have to triple-check the translation,
> because we don't have malloc in Java and have to make assumptions,"
> but these cases should be well defined and clearly stated. OTOH I
> assume you do that sort of thing with your customers.
> That said, I value code cleanliness a lot, having worked on quite  
> large projects, cleaned them up and seen them deteriorate again in  
> the span of a couple of months. So I think I know what you mean.

In our case, we check whether, say, a memset() call matches any usage
pattern that we've seen before and that has a Java equivalent. If not,
we leave it in the code, give a warning, and you end up with a memset()
call in the middle of your Java code.
So, obviously, our translator works much better on "vanilla business
logic" code than it does on low-level library code.
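
To make the memset() case concrete, here is a minimal sketch of that
"known pattern, else warn and leave it" behavior. It is not Jazillian's
actual rule set: the class name MemsetRule, the single regex, and the
choice of Arrays.fill as the Java equivalent are all assumptions made
for illustration, and the rewrite only makes sense if buf ended up as
an int[] on the Java side.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical rule, for illustration only.
class MemsetRule {
    // One assumed "seen before" idiom: memset(buf, 0, sizeof(buf))
    private static final Pattern ZERO_FILL = Pattern.compile(
        "memset\\(\\s*(\\w+)\\s*,\\s*0\\s*,\\s*sizeof\\(\\s*\\1\\s*\\)\\s*\\)");

    /** Rewrites the known idiom; otherwise warns and returns the line unchanged. */
    static String translate(String line) {
        Matcher m = ZERO_FILL.matcher(line);
        if (m.find()) {
            // Assumes buf was translated to an int[] in the Java output.
            return m.replaceAll("java.util.Arrays.fill($1, 0)");
        }
        if (line.contains("memset(")) {
            // No known pattern: keep the C call in the output and warn.
            System.err.println("warning: untranslated memset(): " + line.trim());
        }
        return line;
    }

    public static void main(String[] args) {
        System.out.println(translate("memset(buf, 0, sizeof(buf));"));
        System.out.println(translate("memset(p + 4, 0xff, n);"));
    }
}
```

The second call in main matches no pattern, so it comes back untouched
with a warning on stderr, which is the "memset() in the middle of your
Java code" outcome described above.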

>
>>> This is not generally the case with artificial languages.
>>
>>
>> Generally, but then again, it's 10pm, and my officemate is looking  
>> at a line of code like this:
>> typedef char MYCHAR[25];
>> ...which of course tells us to replace any occurrences of "MYCHAR"
>> with "char[25]".
>
>
> Yes? That's the way typedefs work. Am I missing anything? How's that  
> violating the grammar?

Typedefs are usually of the form:
typedef THIS IS THE REPLACEMENT    THING_TO_BE_REPLACED;
This one is of the form:
typedef PART_OF_REPLACEMENT THING_TO_BE_REPLACED REST_OF_REPLACEMENT;
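
As a rough token-stream illustration of the two shapes, the sketch
below expands a declaration that uses a typedef'd name. Everything in
it is hypothetical (the class TypedefExpand, the expand method, and the
rule that the typedef name is the token just before the first "[" or,
failing that, just before ";"). It ignores function pointers and other
declarator forms, so treat it as a toy, not as how any real translator
does it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only.
class TypedefExpand {
    /**
     * typedef: tokens of "typedef PREFIX... NAME SUFFIX... ;"
     * decl:    tokens of a declaration such as "NAME var ;"
     * Returns decl with NAME expanded around the declared variable.
     */
    static List<String> expand(List<String> typedef, List<String> decl) {
        int semi = typedef.indexOf(";");
        int bracket = typedef.indexOf("[");
        // Assumed rule: the name sits just before "[" if present, else before ";".
        int nameAt = (bracket >= 0 ? bracket : semi) - 1;
        String name = typedef.get(nameAt);
        List<String> prefix = typedef.subList(1, nameAt);        // e.g. [char]
        List<String> suffix = typedef.subList(nameAt + 1, semi); // e.g. [ [, 25, ] ]

        List<String> out = new ArrayList<>();
        for (int i = 0; i < decl.size(); i++) {
            String tok = decl.get(i);
            if (tok.equals(name) && i + 1 < decl.size()) {
                out.addAll(prefix);
                out.add(decl.get(++i)); // the declared variable
                out.addAll(suffix);
            } else {
                out.add(tok);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> td = Arrays.asList("typedef", "char", "MYCHAR", "[", "25", "]", ";");
        List<String> decl = Arrays.asList("MYCHAR", "x", ";");
        System.out.println(String.join(" ", expand(td, decl)));
    }
}
```

With the usual form the suffix is empty and the expansion is a plain
substitution; with the MYCHAR form the replacement has to wrap around
the variable name, which is what makes it feel out of order.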

This illustrates the AST vs. token stream mentality really well. The
"oh no!" moment that I get when I see a token sequence whose meaning is
out of whack like this is a bad one. But the "I'm now going to have to
think a bit to make sure I understand this" feeling that I'm
experiencing here is very similar to the "I'm going to have to think
now about what the AST looks like" feeling that I'd have ALL THE TIME
with ASTs.

Andy.

>
> -k
>



