[antlr-interest] How do you build this AST?
Sam Harwell
sharwell at pixelminegames.com
Tue Feb 17 06:17:50 PST 2009
If you have the option output=AST specified, then your rewrite rules are the default behavior. Unnecessary rewrites will incur a significant performance penalty since they are not optimized. The penalty is masked if the grammar is not properly left-factored, but it should still be avoided. The ^,! Notation has better performance than rewrites as well. IE, this:
root : subroot1? subroots* -> subroot1? subroots* ;
subroots : subroot2 | subroot3;
Is the same as both this:
root : subroot1? subroots*;
subroots : subroot2 | subroot3;
And this:
root : subroot1? (subroot2 | subroot3)*;
If you absolutely must have the Root node in your output tree, use this:
root : subroot1? (subroots+=subroot2 | subroots+=subroot3)* -> ^(Root subroot1? subroots*);
Sam
-----Original Message-----
From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Johannes Luber
Sent: Tuesday, February 17, 2009 5:46 AM
To: Gabriel Petrovay; antlr-interest at antlr.org
Subject: Re: [antlr-interest] How do you build this AST?
> Hi all,
>
> (In this e-mail I talk about result obtained with ANTLR 3.1.1)
>
> Below you have a simple sample grammar. I what to get from this
> grammar an AST as illustrates below for the following input:
> "example child leaf other child leaf child leaf done".
>
> I am not asking you to check is the grammar is correct. Just give me
> please an example on how should I write this in antlr3 grammar in
> order to get a tree like below. Or give me an example for one rule and
> I'll find the way for the others.
>
> It seems that the output=AST option is not enough. Now I only get a
> CommonTree node having 9 CommonTree children each representing one of
> the lexer tokens.
>
> (I have found a solution though, but there is a problem with it. See
> bottom of email. Is this the way to do it?)
>
> //---------------------------------
> grammar example ;
> options {
> output=AST;
> }
>
> // parser rules
> root : subroot1? (subroot2 | subroot3) ;
> subroot1 : EXAMPLE ;
> subroot2 : child1 child2 ;
> child1 : CHILD subchild ;
> child2 : OTHER CHILD subchild ;
> subchild : LEAF ;
> subroot3 : child1 DONE;
>
> // lexer rules
> EXAMPLE : 'example' ;
> LEAF : 'leaf' ;
> CHILD : 'child' ;
> OTHER : 'OTHER' ;
> DONE : 'done' ;
> WS : ''\t' | '\s' | '\r' | '\n' {$channel=HIDDEN} ;
> //---------------------------------
>
> !USE MONOSPACE FONTS TO PROPERLY SEE THE BELOW TREE!
> ------ root ------
> / | \
> / | \
> subroot1 subroot2 subroot3
> / \ |
> / \ child1
> child1 child2
> | |
> subchild subchild
>
>
> Thank you!
> (very much)
>
>
>
> Possible solution? (Problem described after the grammar.)
>
> //---------------------------------
> grammar example ;
> options {
> output=AST;
> }
>
> // imaginary tokens (nodes)
> tokens {
> Root;
> Subroot1;
> Subroot2;
> Subroot3;
> Child1;
> Child2;
> Subchild;
> }
>
> // parser rules
> root : subroot1? (subroot2 | subroot3) -> ^(Root subroot1?
> subroot2? subroot3?);
> subroot1 : EXAMPLE;
> subroot2 : child1 child2 -> ^(Subroot2 child1 child2);
> child1 : CHILD subchild -> ^(Child1 subchild);
> child2 : OTHER CHILD subchild -> ^(Child2 subchild);
> subchild : LEAF;
> subroot3 : child1 DONE -> ^(Subroot3 child1);
>
> // lexer rules
> EXAMPLE : 'example' ;
> LEAF : 'leaf' ;
> CHILD : 'child' ;
> OTHER : 'OTHER' ;
> DONE : 'done' ;
> WS : ''\t' | '\s' | '\r' | '\n' {$channel=HIDDEN} ;
> //---------------------------------
>
>
> The problem with the above grammar apeares with the following example.
> Assume that 'root' has the following rule:
> root : subroot1? (subroot2 | subroot3)* -> ??? how to
> transform this ??? ;
> This rule allows any sequence of subroot2's and subroot3's.
>
> The problem with this transformation is that the '|' operator is not
> allowed in the rule transformation. The following is illegal:
> root : subroot1? (subroot2 | subroot3)* -> subroot1? (subroot2
> | subroot3)* ;
>
> If I write it like:
> root : subroot1? (subroot2 | subroot3)* -> subroot1? subroot2*
> subroot3* ;
> then I loose the ordering.
>
> How can I solve this? Am I on a wrong track?
Try
root : subroot1? subroots* -> subroot1? subroots* ;
subroots : subroot2 | subroot3;
Johannes
--
Jetzt 1 Monat kostenlos! GMX FreeDSL - Telefonanschluss + DSL
für nur 17,95 Euro/mtl.!* http://dsl.gmx.de/?ac=OM.AD.PD003K11308T4569a
List: http://www.antlr.org/mailman/listinfo/antlr-interest
Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
More information about the antlr-interest
mailing list