[antlr-interest] conditional tree rewrite
Jim Idle
jimi at temporal-wave.com
Thu Sep 17 07:57:35 PDT 2009
Marco,
You can rewrite inside any alt so only rewrite as conditional if you
take that alt. Also, avoid imaginary token bloat- you don't need the
imaginaries in expression trees (generally) and can just use the ^
operator. The AST is more about being unambiguous and easilly walkable
than how it looks to you when you print the tree. You probably want to
lose the 'literals' and make these lexer defined tokens. It will be
easier when you build the tree walker.
Jim
On Sep 17, 2009, at 6:32 AM, Marco Trudel <marco at mtsystems.ch> wrote:
> Dear all
>
> I'm trying to adapt the "ANSI C grammar for ANTLR v3" so that it
> generates an AST. So my first step was to change all rules like:
>
> lvalue
> : unary_expression
> ;
>
> to
>
> lvalue : lvalue2 -> ^(ART_LVALUE lvalue2);
> lvalue2
> : unary_expression
> ;
>
> This already gives me a nice AST, except for expressions. For instance
> "0" will be represented as:
> ART_EXPRESSION
> '-ART_ASSIGNMENT_EXPRESSION
> '- ART_CONDITIONAL_EXPRESSION
> '- ART_LOGICAL_AND_EXPRESSION
> .
> .
> .
> '- ART_PRIMARY_EXPRESSION
> '- ART_CONSTANT
> '- 0
>
> I assume that the grammar has been build this way for performance
> reasons. The rule for conditional_expression for instance looks like:
>
> conditional_expression
> : logical_or_expression ('?' expression ':'
> conditional_expression)?
> ;
>
> So it's actually not a conditional_expression if the part in
> parentheses
> isn't there. So my initial rewrite:
>
> conditional_expression : conditional_expression2 ->
> ^(ART_CONDITIONAL_EXPRESSION conditional_expression2);
> conditional_expression2
> : logical_or_expression ('?' expression ':'
> conditional_expression)?
> ;
>
> creates a "wrong" AST if it's actually not a conditional expression
> and
> I'm now trying to create the tree conditionally. My first approach was
> to change the rules this way:
>
> conditional_expression
> : conditional_expression2 -> ^(ART_CONDITIONAL_EXPRESSION
> conditional_expression2);
> | logical_or_expression
> ;
> conditional_expression2
> : logical_or_expression '?' expression ':' conditional_expression
> ;
>
> But, since there are 16 rules to adapt, after about 6 I run into
> "internal error: ...: could not even do k=1 for decision 49; reason:
> timed out (>1000ms)". So this creates too much ambiguity and
> therefore I
> went for syntactic predicates:
>
> conditional_expression
> : ( logical_or_expression '?' expression ':'
> conditional_expression )
> => logical_or_expression '?' expression ':'
> conditional_expression ->
> ^(ART_CONDITIONAL_EXPRESSION logical_or_expression '?' expression ':'
> conditional_expression)
> | logical_or_expression
> ;
>
> But the same problem (too ambiguous to compile). So I'm really stuck
> now
> since I don't know how to do this correct (meaning efficient). I could
> imagine that this can be done with semantic predicates (I don't know
> how) but I'm not sure about the runtime overhead when parsing a C
> file.
> Or I can go over the created tree myself and clean up the expressions.
> But the tree is already substantial bigger than it should be because
> of
> the intermediate wrong "foo_expression", so I guess solving the
> problem
> at the grammar level is better.
>
> Any opinions? Maybe I miss the completely obvious solution...
> I think I should add that I'm a complete ANTLR beginner; so
> everything I
> wrote might make no sense at all.
>
>
> Thanks and have a nice day
> Marco
>
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
More information about the antlr-interest
mailing list