[antlr-interest] wildcard in tree grammar
Terence Parr
parrt at cs.usfca.edu
Tue Oct 21 10:15:17 PDT 2008
On Oct 21, 2008, at 12:02 AM, Gavin Lambert wrote:
> For analysis purposes, shouldn't ^(anything at all) be considered
> equivalent to a single node anyway? In much the same way that in
> the expression "x + (y + z)", "x" and "(y + z)" are both atoms (in
> terms of precedence).
>
> I'm a bit rusty on ANTLR's internal tree representation, but
> certainly in a "normal" tree this is the case -- any given node
> can have a subtree (or not), and you can uniquely refer to any
> subtree by pointing at its root node. I don't see why ANTLR would
> need to behave any differently (and I can see quite a few cases
> where it'd be beneficial if it could handle both cases at runtime,
> not compile time).
Hi. It turns out that parsing in two dimensions is a bit tricky ;)
ANTLR serializes trees into one-dimensional strings, injecting
imaginary DOWN and UP nodes to represent structure. So, we need to be
able to distinguish between
^(X Y Z)
and
^(X ^(Y Z))
A subtree is very different from a linear list in terms of lookahead:
A B is different from ^(A B). Lookahead is A B on the first one (LL(2))
and A DOWN on the second one.
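
To make the flattening concrete, here is a minimal sketch in Python. It is
not ANTLR's actual code: the (token, children) tuple representation and the
names serialize, leaf, DOWN and UP are made up for illustration. It only
shows how imaginary DOWN/UP markers let a one-dimensional stream tell the
two shapes apart.

```python
# Illustrative only: a toy flattening, not ANTLR's implementation.
DOWN, UP = "DOWN", "UP"  # stand-ins for the imaginary navigation nodes


def leaf(token):
    """Build a node with no children."""
    return (token, [])


def serialize(tree):
    """Flatten a (token, children) tree into a 1-D list of tokens."""
    token, children = tree
    out = [token]
    if children:
        out.append(DOWN)              # entering the child list
        for child in children:
            out.extend(serialize(child))
        out.append(UP)                # leaving the child list
    return out


flat = serialize(("X", [leaf("Y"), leaf("Z")]))        # ^(X Y Z)
nested = serialize(("X", [("Y", [leaf("Z")])]))        # ^(X ^(Y Z))
pair = serialize(leaf("A")) + serialize(leaf("B"))     # A B
sub = serialize(("A", [leaf("B")]))                    # ^(A B)
```

With this encoding, the first two symbols of pair are A B while the first
two symbols of sub are A DOWN, which is the lookahead difference described
above.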
>
> Given the original problem mentioned in the issue:
> input: ^(not ^(and ^(= a b) ^(= c d)))
> rule: ^('not' ^('and' c51=. c52=.)) -> ...
>
> I don't see how this can be misinterpreted. While processing the
> 'and' subtree, it reads the first child node, discovers that it's
> a subtree, reads the whole thing in and assigns the root node
> (with dangling subtree) to c51. Then it does the same for the
> next subtree and c52.
Agreed. After playing around all day yesterday, I came to the
conclusion that the wild-card should in fact mean single node or
subtree, which is normally what you want. I have simply altered the
analysis to consider wild-card as really ^(. .*) :)
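
As a rough illustration of the "single node or subtree" meaning, here is a
Python sketch that matches one wildcard against a flattened stream. This is
an assumption about how one could do it at runtime, not the analysis change
itself; match_wildcard and the list-of-strings stream are hypothetical.

```python
# Illustrative only: a toy matcher over a flattened tree stream.
DOWN, UP = "DOWN", "UP"  # stand-ins for the imaginary navigation nodes


def match_wildcard(stream, i):
    """Consume one node, plus its whole subtree if it has one.

    Returns the index just past the matched node or subtree.
    """
    if stream[i] in (DOWN, UP):
        raise ValueError("wildcard cannot start at a navigation node")
    i += 1                            # the node itself
    if i < len(stream) and stream[i] == DOWN:
        depth = 0
        while True:                   # skip to the matching UP
            if stream[i] == DOWN:
                depth += 1
            elif stream[i] == UP:
                depth -= 1
            i += 1
            if depth == 0:
                break
    return i


# Flattened form of ^(not ^(and ^(= a b) ^(= c d)))
stream = ["not", DOWN, "and", DOWN,
          "=", DOWN, "a", "b", UP,
          "=", DOWN, "c", "d", UP,
          UP, UP]

after_c51 = match_wildcard(stream, 4)          # first child of 'and'
after_c52 = match_wildcard(stream, after_c51)  # second child of 'and'
```

Here each wildcard swallows an entire ^(= ...) subtree, and after the second
match the next symbol is the UP that closes the 'and' subtree. On a plain
node with no children the same function consumes just that one node.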
Ter