[antlr-interest] LPG WAS Retaining comments

Andy Tripp antlr at jazillian.com
Thu Mar 13 11:22:48 PDT 2008


Jim Idle wrote:
>   
>> -----Original Message-----
>> From: Andy Tripp [mailto:antlr at jazillian.com]
>> Sent: Thursday, March 13, 2008 8:31 AM
>> To: Jim Idle
>> Cc: antlr-interest at antlr.org
>> Subject: Re: [antlr-interest] LPG WAS Retaining comments
>>
>> Jim Idle wrote:
>>     
>>> [snip snide remarks about parse trees and ASTs]
>>> ...Good Grief
>>>       
>> Yea, I probably need to take an anger management class.
>>
>> I'm pretty sure it was the lack of an answer to "what good is a flat
>> AST?" that set me off.
>> So much so that I've lowered my estimate of its usefulness from 1% down
>> to 0%.
>>     
>
> Nobody said that there was any such use. 
You said: We can't 'know' that they did or didn't want a flat tree.
That, to me, implies that you're saying there's a case where someone 
might want a flat tree.

You said: but  I think it [a warning message about a flat tree] would 
actually annoy 100% of the time.
Again, that sounds to me like you think there is some case where a flat 
tree is what the person wanted.

You said: I think it [the case of a flat tree being produced] is an 
arbitrary example with no real need.
Given that this arose from someone who really did get a flat tree and 
was confused by it, and you saying
this case is "arbitrary" and there's no real need for a message, seems 
to imply (to me at least) that there's
some case in which a flat tree (and no warning) is reasonable.

You said: I see a lot of suggestions for warnings and errors and so on 
that surely seem reasonable
to the requester, but in fact are specific to their particular 
situation. If you start
spitting out warnings saying "You don't have any ^ characters", all you 
are going to do
is annoy those who know about that,

If you agree that there's no real reason to want a flat AST, why would a 
warning message be "specific to their particular situation"?
There is no "particular situation" where the flat AST is OK.
> I could make one up, but that would be pretty pointless. Oh well, suppose that your parser just works with the lexer to get some intermediate token form, but the language is such that you can't really infer any structure at that point as you need multiple passes to work anything out at all. Then your parser might build symbols (or just leave it to an AST walk), then pass the unshaped tree along for actual shaping now that it knows something about
>   
Yea, I know we can come up with contrived examples, and I agree that's 
irrelevant.
> But, as I said a bunch of times, whether a flat tree is any good or not is utterly and completely not the point. It is just that any auto generated structure probably isn't any better. The point is that spending a lot of effort 
Not necessarily a lot of effort. To keep a flag that indicates whether 
any ^ exists, and then call the existing
code that produces a parse tree in that case, should be easy. Though, I 
know no one should ever say that it would
be easy for someone else. If that's not trivial, just add a check just 
before returning an AST, and if it's flat,
return a parse tree instead. That could easily be done without any real 
changes to the depths of ANTLR.
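
A minimal sketch of that last check, just to show how small it could be. 
The Node class and both function names are hypothetical and are not 
ANTLR's tree API. It also assumes that a flat AST shows up as a root with 
no token of its own (a "nil" root) whose children are all leaves.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    # Hypothetical tree node; text is None for a "nil" root (assumption).
    text: Optional[str]
    children: List["Node"] = field(default_factory=list)


def is_flat(root: Node) -> bool:
    """True when the root is nil and none of its children has children."""
    return root.text is None and all(not c.children for c in root.children)


def choose_tree(ast: Node, parse_tree: Node) -> Node:
    """Return the AST unless it is flat; then fall back to the parse tree."""
    return parse_tree if is_flat(ast) else ast
```

The check runs once, just before the tree is handed back, so nothing 
deeper in the tree-building code would need to change under these 
assumptions.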
> to try and produce a tree automagically makes no sense, as it will undoubtedly be worth very little to anyone. 
The original poster has just said he was confused because he didn't get 
a tree back. I know I would be, too,
and one other poster chimed in that he's been there. And we have this 
other paper here, whose authors built a whole tool without even 
acknowledging that what it produces is just a parse tree!

> The parse tree is only useful if you really want to do something with parser tree, which more than likely is just display it, so there is no real point making that the default tree. 
That's coming from your point of view. But the newbie gets a lot out of 
it. He now *sees* that his
parser is working correctly, and sees, probably for the first time, the 
true structure of his grammar,
and the true structure of some parsed input.  That's huge to a newbie.

I've been quite surprised several times when I've seen that Terence built a 
parser that produced no tree.
"How does he even know it's doing anything reasonable?" I wonder. "How 
does he know that
there's not some bug, and the entire lexer is matching the whole input 
as a single word, and the parser
is completely wrong, but perfectly happy to parse that single token?". 
The answer  (besides that
he's good enough not to make such an error) might be that he glances at 
the parse tree or the generated
code or something. But for the person who has just built his first 
grammar (or for someone lazy like me),
it would be really nice to see something "tree-like" from the parser to 
see that it's working.
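
As a rough illustration of what "tree-like" output buys you, here is a 
small sketch that prints a tree in LISP-style text. The (text, children) 
tuple shape and the function name are made up for this example and are not 
taken from ANTLR.

```python
def to_string_tree(node):
    """Render a (text, children) tuple as LISP-style text."""
    text, children = node
    if not children:
        return text
    inner = " ".join(to_string_tree(child) for child in children)
    return "(" + text + " " + inner + ")"


# A parse where the lexer split the input into separate tokens.
good = ("stat", [("assign", [("a", []), ("=", []), ("expr", [("1", [])])])])

# A parse where a buggy lexer swallowed the whole input as one token.
bad = ("stat", [("a=1;", [])])

print(to_string_tree(good))
print(to_string_tree(bad))
```

Printed side by side, the single-token case is obvious at a glance, which 
is the kind of sanity check I mean.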

Yes, after this discussion, I'll just look at the parse tree, but I do 
think any newbie should be able to
see some sort of tree output from his parser to tell that it's working, 
before he's gotten into AST-building.
Those guys with this LPG paper are thinking the same way apparently.
> Basically as soon as you want to actually do something with the tree beyond look at it in ANTLRWorks, which gives you the parser tree anyway, 
Not everyone is going to use ANTLRworks. I found it to get in the way 
more than it helped.
> you will realize that you need to formulate your own structure. Hence, a default structure isn't really of any use to anyone, not even people new to the idea. 
I don't agree with your logic. If every house builder will eventually 
have to learn how to build a roof, does that
mean using a tarp in the meantime is of no use to him? The point of the 
tarp is to help him finish the thing he's
working on until he gets to the roof.  And a parse-tree can help a 
person to debug his parser first, before he gets
to the AST-building part.
> Most people will know what 'tree' means, but then need to work out what parser tree vs AST means and will soon be on their way to using it effectively.
>   
I agree. And I think in the end, they'll end up with "The parse tree is 
really an AST that's just got too much stuff in it".
Now you might say that's technically wrong. If so, fine, replace "AST" 
by just "tree" to make it right.
And so maybe they should start at the same place they end up: give them 
a tree with "far too much" rather than
"far too little".
>   
>> The "Trust me, I translate from one non-trivial language to another
>> without needing an AST,
>> but I won't give any details at all" thread from someone else is really
>> bugging me, too.
>>     
>
> Well, all the poster was saying (I think) is that he was able to write a translator by just using actions in the parser. I have done the same thing in the past. However, what Loring was trying to say (and most would agree I think), is that as soon as you get to a non-trivial case, while you can probably find a way to do without, you will find it better on many levels to construct an AST. I don't think that that is in any way a contentious statement.
>   
But it is contentious...I'm making it contentious by saying essentially 
"no, I don't think that what you're saying is possible - please give me
some details so we can all understand".

Just to do that again, consider keywords. If someone is saying they can 
translate language A to B, what if language
B has a keyword that's not in language A? You'll need to check for 
variables in language A whose name is a keyword in B
(say a variable named "null" when going from C to Java). And then you'll 
need to rename it, being sure not to use a
name that's already in use (requiring a symbol table). And to rename it, 
you'll need to know which references go with
which declarations (i.e. we may have more than one variable named 
"null"), and to do that, you'll need to know all the
scoping rules (i.e. this "null" variable declaration - how far does it 
go? Just this file? This namespace/package? Perhaps
a certain set of files as defined by some #include construct?).
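
To make the bookkeeping concrete, here is a minimal sketch of the renaming 
step. Everything in it is an assumption for illustration: the Scope class, 
the numeric-suffix naming scheme, and the tiny set of reserved words. Real 
scoping rules (files, namespaces, #include) are far more involved than a 
simple parent chain.

```python
# Small illustrative subset of words reserved in the target language.
RESERVED_IN_TARGET = {"null", "class", "final", "new", "this"}


class Scope:
    """One lexical scope, mapping each declared name to the name to emit."""

    def __init__(self, parent=None):
        self.parent = parent
        self.renames = {}

    def declare(self, name, used):
        """Record a declaration, renaming it if it clashes with a reserved word.

        `used` holds every identifier already taken, so a fresh name
        never collides with an existing one.
        """
        new_name = name
        if name in RESERVED_IN_TARGET:
            suffix = 1
            new_name = f"{name}_{suffix}"
            while new_name in used:
                suffix += 1
                new_name = f"{name}_{suffix}"
        used.add(new_name)
        self.renames[name] = new_name
        return new_name

    def resolve(self, name):
        """Find the emitted name for a reference, innermost scope first."""
        scope = self
        while scope is not None:
            if name in scope.renames:
                return scope.renames[name]
            scope = scope.parent
        return name
```

Even this toy version needs a set of names in use, a scope chain, and a 
lookup per reference, which is exactly the information I'm asking about.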

I'm sure there must be languages where these issues don't come up, but I 
can't imagine how they could be non-trivial.

Naturally, if no one is even willing to name the two languages, that 
makes me really wonder what's going on.
> Anyway, I have spent enough time on this and probably bored the pants off everyone, so I think I will fix the reported 3.1 C runtime bugs instead ;-)
>   
Yea, sorry to bog you down from real work.
> Jim
