[antlr-interest] flat AST tree
Torsten Curdt
tcurdt at vafer.org
Sun Aug 24 04:32:22 PDT 2008
> What hierarchy? How can ANTLR possibly know which of the tokens is
> the logical root of the tree, and of the subtrees? Operators, for
> example, commonly appear infix, while things like class declarations
> appear prefix, with additional modifiers optionally prefixed and
> suffixed to that. There's just no way it can create a sensible
> structure without information from the grammar author, so it doesn't
> try.
Well, of course there is an implicit hierarchy:
ruleA: ruleB;
ruleB: ruleC;
is basically:
ruleA -> ruleB -> ruleC
That is what ANTLRWorks displays, and that is what I expected the
AST to be. It might not be the desired final structure - but
that's a different matter.
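To make the distinction concrete, here is a minimal Python sketch (not ANTLR code; the Node class and the helper names are hypothetical). It contrasts the rule-invocation hierarchy that a parse tree view shows with the flat, nil-rooted token list one would expect from an AST when the grammar names no root token. The token names "x" and "y" are placeholders.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)


def parse_tree_chain(rules, tokens):
    """Nest one node per rule invocation: ruleA -> ruleB -> ruleC -> tokens."""
    node = Node(rules[-1], [Node(t) for t in tokens])
    for rule in reversed(rules[:-1]):
        node = Node(rule, [node])
    return node


def flat_ast(tokens):
    """With no root named by the grammar, all tokens hang off one nil node."""
    return Node("nil", [Node(t) for t in tokens])


def render(node):
    """Render a tree in LISP-like notation, e.g. (root child child)."""
    if not node.children:
        return node.label
    return "(" + " ".join([node.label] + [render(c) for c in node.children]) + ")"


parse_tree = parse_tree_chain(["ruleA", "ruleB", "ruleC"], ["x", "y"])
ast = flat_ast(["x", "y"])
print(render(parse_tree))  # (ruleA (ruleB (ruleC x y)))
print(render(ast))         # (nil x y)
```

The parse tree records which rule matched what; the AST only records the structure the grammar author asked for, which is none in this case.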
> You can either use a full rewrite notation, as you showed in your
> original email, or you can use a more concise representation for the
> simple case where there's only one root and you want the tokens to
> appear in order.
>
> For example, the rewrite you posted:
> classDeclaration : 'class' Identifier ('extends' identifierList)? classBody
>     -> ^('class' Identifier ('extends' identifierList)? classBody) ;
>
> Could have just been written like this:
> classDeclaration : 'class'^ Identifier ('extends' identifierList)?
> classBody;
>
> Either way, it will produce a tree like this:
> ^('class' Identifier (contents of classBody...))
> or this:
> ^('class' Identifier 'extends' (contents of identifierList...)
> (contents of classBody...))
>
> You can extend this to modify the tree a bit; for example to put the
> 'extends' clause into a subtree if it's present, you can use this:
> classDeclaration : 'class'^ Identifier baseClass? classBody;
> baseClass : 'extends'^ identifierList;
>
> If you wanted to change the order things appeared in, though, or
> insert additional tokens not present in the input (imaginary
> tokens), then you have to use a -> rewrite. (You can omit tokens
> present in the input with !, though.)
I see
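As a way of checking my understanding, here is a small Python model of what the ^ and ! suffixes do (a sketch only, not ANTLR's implementation; the build function and its tuple representation are hypothetical). It assumes at most one ^ per rule, as in the examples quoted above, and treats subrule results such as classBody as opaque placeholders.

```python
def build(elements):
    """Model of inline tree operators.

    elements: list of (text, op) pairs, where op is "" (plain child),
    "^" (make this token the root) or "!" (omit it from the tree).
    Returns a tuple (root, child, ...) or, if no root was named,
    a flat list of children.
    """
    root = None
    children = []
    for text, op in elements:
        if op == "!":
            continue  # '!' leaves the token out of the tree
        if op == "^":
            if root is not None:
                # Assumption of this sketch: only one root per rule.
                raise ValueError("this model supports a single ^ only")
            root = text  # children seen so far are kept under the new root
        else:
            children.append(text)
    return (root, *children) if root is not None else children


class_decl = build([("class", "^"), ("Identifier", ""), ("classBody", "")])
parens = build([("(", "!"), ("expr", ""), (")", "!")])
no_ops = build([("class", ""), ("Identifier", ""), ("classBody", "")])
print(class_decl)  # ('class', 'Identifier', 'classBody')
print(parens)      # ['expr']
print(no_ops)      # ['class', 'Identifier', 'classBody']
```

The last call shows the flat result that started this thread: with no operator, nothing is promoted to a root.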
> The things to remember are:
> 1. You can either use a -> rewrite *or* the ^ and ! operators
> within any given rule -- never both. (You get *really* obscure
> errors if you violate this.) Different rules can use different
> types, though.
> 2. -> gives you more flexibility but you need to restate things,
> and you need to make sure that the cardinality on both sides matches.
> 3. If you're not using a -> rewrite, then you can have at most one
> ^ within any given rule.
> 4. The root node must be a single token. You can't make a subrule
> or block into a root.
> 5. You can use -> multiple times within a rule, but each time it
> sets the output of the entire rule. But you can refer to a tree
> built by an earlier use of ->, so you can build up recursive trees
> very easily.
Thanks for summarizing that!
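Point 5 can be pictured with a short Python sketch (a model of the idea, not of ANTLR's generated code; rewrite_chain and the operand names are hypothetical). Each step replaces the result of the whole rule, and each later step wraps the tree built by the previous one, which is how a nested tree grows out of a flat operand list.

```python
def rewrite_chain(first, rest):
    """Model of repeated rewrites within one rule.

    first: the first operand matched.
    rest:  list of (operator, operand) pairs matched afterwards.
    """
    tree = first  # first rewrite: the rule's result is the bare operand
    for op, operand in rest:
        # Later rewrite: the new result refers to the previous result.
        tree = (op, tree, operand)
    return tree


result = rewrite_chain("a", [("+", "b"), ("-", "c")])
print(result)  # ('-', ('+', 'a', 'b'), 'c')
```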
cheers
--
Torsten