[antlr-interest] wildcard in tree grammar

Tue Oct 21 00:02:01 PDT 2008

At 07:26 21/10/2008, Terence Parr wrote:
 >wildcard is single node in tree grammar analysis but node or tree
 >at runtime
 >
 >We need both single node wildcard and tree wildcard. DFA analysis
 >sees '.' as a single node.
 >
 >If you say ^('+' . .), that expects two single nodes as children
 >at analysis time. The problem is that at runtime we want wildcard
 >to match a subtree as well. We need to tell the analysis
 >specifically which one we mean. I can see a situation where you
 >want to match literally a single node versus a subtree. I don't
 >want to flip wildcard to mean subtree.

For analysis purposes, shouldn't ^(anything at all) be considered 
equivalent to a single node anyway?  In much the same way that in 
the expression "x + (y + z)", "x" and "(y + z)" are both atoms (in 
terms of precedence).

I'm a bit rusty on ANTLR's internal tree representation, but 
certainly in a "normal" tree this is the case -- any given node 
can have a subtree (or not), and you can uniquely refer to any 
subtree by pointing at its root node.  I don't see why ANTLR would 
need to behave any differently (and I can see quite a few cases 
where it'd be beneficial if it could handle both cases at runtime, 
not compile time).

Given the original problem mentioned in the issue:
   input: ^(not ^(and ^(= a b) ^(= c d)))
   rule: ^('not' ^('and' c51=. c52=.)) -> ...

I don't see how this can be misinterpreted.  While processing the 
'and' subtree, it reads the first child node, discovers that it's 
a subtree, reads the whole thing in and assigns the root node 
(with dangling subtree) to c51.  Then it does the same for the 
next subtree and c52.
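
To make that concrete, here's a rough Python sketch of the matching I have in mind. It is not ANTLR's actual code; Node, flatten, match_any, expect and match_not_and are names I made up for illustration, and the DOWN/UP markers are only modelled on the navigation tokens a tree parser sees in its node stream. The point is just that a wildcard can consume "one node" and still carry the whole subtree along with it, by skipping to the matching UP.

```python
from dataclasses import dataclass, field

# Hypothetical navigation markers for the flattened node stream.
DOWN, UP = "DOWN", "UP"


@dataclass
class Node:
    text: str
    children: list = field(default_factory=list)


def flatten(node):
    """Serialize a tree as: root, DOWN, children..., UP (a leaf is just itself)."""
    out = [node]
    if node.children:
        out.append(DOWN)
        for child in node.children:
            out.extend(flatten(child))
        out.append(UP)
    return out


def match_any(stream, i):
    """Wildcard: consume one node, plus its whole subtree if it has one.

    Returns (root node, index of the next unconsumed stream element).
    """
    root = stream[i]
    if not isinstance(root, Node):
        raise ValueError(f"wildcard cannot match {root!r}")
    i += 1
    if i < len(stream) and stream[i] == DOWN:
        depth = 0
        while True:  # skip ahead to the UP that balances this DOWN
            tok = stream[i]
            if tok == DOWN:
                depth += 1
            elif tok == UP:
                depth -= 1
            i += 1
            if depth == 0:
                break
    return root, i


def expect(stream, i, want):
    """Match a navigation marker (DOWN/UP) or a node with the given text."""
    tok = stream[i]
    ok = tok == want if want in (DOWN, UP) else (
        isinstance(tok, Node) and tok.text == want)
    if not ok:
        raise ValueError(f"expected {want!r} at position {i}, got {tok!r}")
    return i + 1


def match_not_and(tree):
    """Match ^('not' ^('and' c51=. c52=.)) and return (c51, c52)."""
    s = flatten(tree)
    i = expect(s, 0, "not")
    i = expect(s, i, DOWN)
    i = expect(s, i, "and")
    i = expect(s, i, DOWN)
    c51, i = match_any(s, i)  # root of first child, subtree attached
    c52, i = match_any(s, i)  # root of second child, subtree attached
    i = expect(s, i, UP)
    i = expect(s, i, UP)
    return c51, c52
```

Feeding it the tree for ^(not ^(and ^(= a b) ^(= c d))) hands back the two '=' nodes with their children still attached, and the same function works unchanged if the children of 'and' are plain leaves.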

Introducing separate operators for "single node" and "subtree" 
seems like a kludge, and it means that flexibility is lost; 
certain possible input trees simply won't be able to be parsed any 
more (or at least not as nicely).