[antlr-interest] Tree parser eats up DOWN node when navigating optional child node

Junkman j at junkwallah.org
Thu Aug 5 11:50:07 PDT 2010


Gerald Rosenberg wrote:
>  ------ Original Message (Wednesday, August 04, 2010 5:21:33 PM) From:
> Junkman ------
> Subject: Re: [antlr-interest] Tree parser eats up DOWN node when navigating
> optional child node
>>
>> You wrote "AST ^( ^( PARENT A ) B )".  Can you describe the tree this
>> notates?  I can see it as a node sequence, but I don't know what tree
>> structure it is describing.
>>
>> Thanks for the reply.
>>
>> Jay
>>
> 
> The AST
> 
> ^( ^( PARENT A? ) B? )
> 
> should implement as
> 
> ( ( PARENT Token.DOWN A? Token.UP ) Token.DOWN B? Token.UP )
> 
> but is actually
> 
> ( PARENT Token.DOWN A? B? Token.UP )
> 
> Because parent_a is the root of parent, the parser is (for some reason)
> not actually generating the middle Token.UP Token.DOWN sequence.

It's because the parser generates trees, not node streams.

UP and DOWN nodes are marker nodes injected while flattening a tree, and
the resulting node stream naturally will contain neither empty DOWN-UP
sequence (edges to non-existing node) nor empty UP-DOWN sequence between
sibling nodes (duplicate edges).

So the parser's tree generation behavior makes sense.  What's new to me
is that tree parser interprets the rewrite expression differently (e.g.,
expecting the empty marker node sequences), and I think that is contrary
to TDAR's suggestion that tree parser rules, in a large part, can be
constructed simply by preserving the rewrite expressions from the parser
rules.

BTW, I found an open bug issue that may be related:

http://www.antlr.org/jira/browse/ANTLR-391

It's reported by and assigned to Terrance, so perhaps he can comment on
this?

> explains why P and PA work, but PB and PAB do not - after matching for
> A?, the tree parser is looking for UP, but finding B.  Not sure why
> Antlr is doing this - not expected.
> 
> Changing A and/or B to non-optional does not change this behavior.
> 
> If, however, you change the parent rule to
> 
> parent : parent_a B? -> ^( M parent_a B? )  ;
> 
> where M is an imaginary token (and make the corresponding change to the
> tree grammar), all four patterns will parse and match as expected:
> 
> AST:
> 
> ^( M ^(PARENT A? ) B? )
> 
> properly implements as
> 
> ( ( M Token.DOWN PARENT A? Token.UP ) Token.DOWN B? Token.UP )
> 
> 

Yes, but this is addressing a different issue - I want the tree parser
to recognize my AST, rather than changing the AST to fit the tree parser.

For now, though, I think I get the gist of how tree parser interprets
the rewrite expression (differently than parser), so I will have to
update my tree parser grammar accordingly, although it's odd that my
current (i.e. old) tree parser generates relatively few error messages...

Thanks for your help,

Jay

PS: This list doesn't seem the chattiest of mailing list, but please
chime in if I have it wrong above, or if you have other insight on the
subject.


More information about the antlr-interest mailing list