[antlr-interest] Summary of ANTLR Issues

Tiller, Michael (M.M.) mtiller at ford.com
Mon Jul 7 13:47:55 PDT 2003



> -----Original Message-----
> From: Terence Parr [mailto:parrt at cs.usfca.edu]
> Sent: Monday, July 07, 2003 3:24 PM
> To: antlr-interest at yahoogroups.com
> Subject: Re: [antlr-interest] Summary of ANTLR Issues
>

> On Monday, July 7, 2003, at 10:22  AM, Tiller, Michael (M.M.) wrote:
> 
> > My goal here is to be as constructive as possible and I hope nobody 
> > takes this too personally.
> 
> Heck no.  Sounds like a good idea.  Loring, Monty and I will be in OR 
> this Fri for a 3 day "cabal".  Perfect timing.  We're talking about 
> what we don't like and what we want to add etc...

The timing wasn't a coincidence.  I noted your schedule and moved my
timetable on feedback up to catch your Cabal.

> > Another interesting observation.  Many books use the word "terminals"
> > when discussing the tokens that the lexer generates.  I find it
> > somewhat ironic therefore that without synthetic nodes, these
> > "terminals" end up being the nodes in the AST rather than the leaves.
> > In other words, they are the leaves of the grammar, but the nodes in
> > the AST.  Perhaps I'm just dense, but I haven't really been able to
> > reconcile that in my head.  It seems to me that it makes sense to
> > introduce nodes that are related to *rules* (some rules, not all
> > rules) as well as tokens.
> 
> Imaginary tokens are for grouping things in the tree as you say like 
> DECLARE node or STATEMENT_LIST.  There are an infinite number of 
> grammars that can recognize Java, but they can all generate the same 
> AST (the one I have is pretty decent).  SO, the issue is why tie your 
> internal structure of the language to something that reflects the 
> personal grammar writing style of someone.  Better to focus on the 
> abstract structure and then make the tree building code in 
> the grammar do that.

OK, I'm with you so far.

> You are asking for a parse tree not an AST when you mention the word 
> "rules".
> 
> ...
> > declaration<AST=DeclarationNode>
> >   : type name ";"
> >   ;
> 
> In this case because we don't like parse trees, we like ASTs. :)

OK, hold on.  I specifically said I wanted this for *some* rules (e.g. rules
that also corresponded to significant structural entities in the AST).  I
certainly understand that you don't want to do this for all rules.  I also
understand that your imaginary tokens may not even correspond to specific
rules.  BUT, the current capabilities only let you associate AST node types
with...well...tokens (which is also very specific to the parse tree :-).
Being able to associate them with *some* rules would still be an
improvement, e.g. the declaration<AST=DeclarationNode> rule quoted above.

The idea would be to use this judiciously in places in the grammar where the
rules also represented logical AST nodes.
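
To make the proposal concrete, here is a minimal sketch in Python (not ANTLR,
and not anything ANTLR generates) of a node type tied to a whole rule rather
than to a single token. The names DeclarationNode and parse_declaration are
hypothetical and exist only for illustration; the input is assumed to be an
already-tokenized list for a rule of the form  declaration : type name ";"

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DeclarationNode:
    """Hypothetical AST node created for the `declaration` rule as a whole."""
    type_name: str
    var_name: str


def parse_declaration(tokens: List[str]) -> DeclarationNode:
    """Mimic  declaration : type name ";"  on a pre-split token list."""
    if len(tokens) != 3 or tokens[2] != ";":
        raise ValueError("expected: type name ;")
    type_name, var_name, _semicolon = tokens
    # The node corresponds to the rule, not to any one token; the ";" is dropped.
    return DeclarationNode(type_name, var_name)
```

Calling parse_declaration(["int", "x", ";"]) gives back a single
DeclarationNode holding the type and the name, which is the kind of
rule-level node I have in mind.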

> > Heterogeneous ASTs:
> >  
> > ...
> 
> Well,
> 
> expr : atom (PLUS^ atom)* ;
> 
> would be pretty hard to beat, n'est-ce pas?  I mean one character, 
> right? ;)  I admit that the notation is not necessarily so good for 
> declarations, but works well for statements too:
> 
> returnStat : "return"^ expr SEMI ;

I agree with you that it is hard to beat one character for brevity, but I'm
still not quite sure what this is in response to?  This was in the
"Heterogeneous AST" section of my note but I don't see any real
correspondence to hetero ASTs here.  I agree the tree construction is nice
in ANTLR and very important.  I know about "^" and I use it.
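
For anyone following along, the tree shape that "^" produces can be sketched
in a few lines of Python. This is an illustration only, not ANTLR output:
Node and parse_expr are hypothetical names, and the input is assumed to be a
pre-split token list for a rule like  expr : atom (PLUS^ atom)*

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Homogeneous AST node: token text plus a list of children."""
    text: str
    children: List["Node"] = field(default_factory=list)

    def to_lisp(self) -> str:
        """Render the tree as (root child child ...)."""
        if not self.children:
            return self.text
        kids = " ".join(child.to_lisp() for child in self.children)
        return f"({self.text} {kids})"


def parse_expr(tokens: List[str]) -> Node:
    """Mimic  expr : atom (PLUS^ atom)*  on a pre-split token list."""
    if not tokens:
        raise ValueError("expected an atom")
    root = Node(tokens[0])
    pos = 1
    while pos < len(tokens):
        if tokens[pos] != "+" or pos + 1 >= len(tokens):
            raise ValueError("expected '+' followed by an atom")
        # PLUS^ : the operator becomes the new root and the tree built so far
        # becomes its first child, so operators are interior nodes and atoms
        # end up as leaves.
        root = Node("+", [root, Node(tokens[pos + 1])])
        pos += 2
    return root
```

With the tokens a + b + c, parse_expr(["a", "+", "b", "+", "c"]).to_lisp()
returns "(+ (+ a b) c)": every node is still a token, which is the
homogeneous case and not the heterogeneous one I was asking about.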

> > OK, one last thing.  One of the issues that anyone working with a
> > custom language these days has to face is "Why didn't you just develop
> > an XML schema and forget about all this lexer parser stuff?".  The
> > answer to this question is pretty much what Terence laid out in
> > "Humans should not have to grok XML" which addresses the issues nicely
> > (but still doesn't quite convince most people who only have one neuron
> > and it always fires "XML!").
> 
> Hoooray for me! ;)

By the way, I should clarify my parenthetical comment above.  To capture the
true intent, I should have written:

(but still doesn't quite convince THOSE people who only have one neuron and
it always fires "XML!")


--
Mike

 
