[antlr-interest] philosophy about translation

Tue Oct 10 08:12:35 PDT 2006

Sohail Somani wrote:

>
>>It's weird that you're walking token streams, but what it seems is that
>>you're implementing the recursive descent parser by hand...
>>    
>>
But a parser converts a token stream to an AST. I'm not using an AST;
I'm dealing with the stream directly because that seems easier to me.
I find it easier to search for the pattern "f(...) {" to find a function
than to search a tree for a node of type FUNCTION_DECLARATION.
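
To make the "f(...) {" search concrete, here is a minimal sketch in Python. It assumes the token stream is simply a list of strings; the function name find_function_definitions and the CONTROL_KEYWORDS set are made up for illustration and are not part of ANTLR or of the tool discussed here.

```python
# Keywords that also match NAME ( ... ) { but are not function definitions.
CONTROL_KEYWORDS = {"if", "while", "for", "switch"}

def find_function_definitions(tokens):
    """Return names matched by the pattern NAME ( ... ) { in a flat token list."""
    names = []
    for i in range(len(tokens) - 1):
        # Candidate: a non-keyword identifier immediately followed by "(".
        if (tokens[i].isidentifier()
                and tokens[i] not in CONTROL_KEYWORDS
                and tokens[i + 1] == "("):
            depth = 0
            j = i + 1
            # Walk forward to the ")" that balances the opening "(".
            while j < len(tokens):
                if tokens[j] == "(":
                    depth += 1
                elif tokens[j] == ")":
                    depth -= 1
                    if depth == 0:
                        break
                j += 1
            # A "{" right after the closing ")" marks a definition, not a call.
            if j + 1 < len(tokens) and tokens[j + 1] == "{":
                names.append(tokens[i])
    return names
```

A call such as "g ( b ) ;" is skipped because the token after the closing ")" is ";" rather than "{". A real C or C++ front end would need many more cases than this.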

>>    
>>
>>b) It's not actually clear, in COBOL, what a function *is*. There are
>>paragraphs, which typically map to a function, but there can also be
>>"stray code" at the top of a file that's not in a paragraph but needs
>>to be in a function.
>>    
>>
>
>I believe there is an unspoken rule that all bets are off with COBOL?
>  
>
Yes, I suppose. But even with C and C++, what seems simple may not be.
For example, you'd think that
    struct person p[100];
might correspond to a single line of Java, but it doesn't: in Java you
also need to initialize the array (allocating it only gives you 100
null references). So suddenly, out of nowhere, you may have to add a
static block of code. That's a typical one-to-many translation.
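
As a rough illustration of that one-to-many step, the sketch below (Python) turns the single C declaration into several lines of Java text. It assumes the array becomes a static field and that a Java class named person with a no-argument constructor exists; translate_struct_array is a hypothetical helper, not something from the actual translator.

```python
import re

# Matches a C declaration of the form: struct <type> <name>[<size>];
_STRUCT_ARRAY = re.compile(r"struct\s+(\w+)\s+(\w+)\s*\[\s*(\d+)\s*\]\s*;")

def translate_struct_array(c_decl):
    """Translate one C struct-array declaration into a list of Java source lines."""
    m = _STRUCT_ARRAY.fullmatch(c_decl.strip())
    if m is None:
        raise ValueError("not a struct array declaration: " + c_decl)
    type_name, var_name, size = m.groups()
    return [
        # The declaration itself only allocates an array of null references.
        f"static {type_name}[] {var_name} = new {type_name}[{size}];",
        # The extra static block creates one object per element.
        "static {",
        f"    for (int i = 0; i < {var_name}.length; i++) {{",
        f"        {var_name}[i] = new {type_name}();",
        "    }",
        "}",
    ]
```

One input line yields six output lines here, and the static block has to be emitted somewhere other than where the original declaration sat.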

>  
>
>>c) I have a feeling there might be a problem if I move code around. I
>>can't think of a specific example right now, but that's my general
>>reason for avoiding symbol table use if I can: better to have a single
>>data structure (in my case a token stream) than two (a token stream
>>and a symbol table) that need to be kept in sync.
>>    
>>
>
>Well, in your case you're managing both. In my case, I just worry about
>telling antlr the grammar and managing the symbol table appropriately.
>
>For my compiler, I needed to spit out lots of warnings since it was
>basically a cfront type deal, albeit not for C++ (thankfully!). I found
>that having a crude symbol table was very easy and natural. I couldn't
>imagine re-parsing the token stream just to determine the type of a
>variable, something I might need to do more than once. For example, if
>you see:
>
>	a.b();
>
>You might need to know if a is a class (thus making b a static function
>call) or an object. Once you have decided that a is one or the other, I
>don't see why you'd do it again...
>  
>
Because it may well have changed since you last looked at it. In other
words, the symbol table can be a mess to maintain. In my case, the
return type of b() may have changed, its name may have changed, and its
argument types may have changed.
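
A small sketch of what "looking again" could mean, in Python: the answer is recomputed from the current token list on every query, so there is no second structure to keep in sync. The token list representation and the kind_of function are assumptions for illustration, and the declaration patterns it recognizes are deliberately crude.

```python
def kind_of(tokens, name):
    """Scan the current token list to guess whether name is a class or an object."""
    for i, tok in enumerate(tokens):
        if tok != name or i == 0:
            continue
        prev = tokens[i - 1]
        # "class a" declares a as a class.
        if prev == "class":
            return "class"
        # "Foo a ;" or "Foo a = ..." declares a as an object of type Foo.
        if (prev.isidentifier() and i + 1 < len(tokens)
                and tokens[i + 1] in (";", "=")):
            return "object"
    return "unknown"
```

If an earlier rewrite replaces the tokens "Foo a ;" with "class a { }", the next call to kind_of(tokens, "a") returns "class" without any table having to be updated. The cost is a full rescan per query.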

Also, my tool does not have to be fast. If it takes an hour to translate
some code when it could have taken 2 seconds had I designed it "right",
that's OK.