[antlr-interest] conditional tree rewrite
Marco Trudel
marco at mtsystems.ch
Thu Sep 17 06:32:42 PDT 2009
Dear all
I'm trying to adapt the "ANSI C grammar for ANTLR v3" so that it
generates an AST. So my first step was to change all rules like:
lvalue
: unary_expression
;
to
lvalue : lvalue2 -> ^(ART_LVALUE lvalue2);
lvalue2
: unary_expression
;
This already gives me a nice AST, except for expressions. For instance
"0" will be represented as:
ART_EXPRESSION
'-ART_ASSIGNMENT_EXPRESSION
'- ART_CONDITIONAL_EXPRESSION
'- ART_LOGICAL_AND_EXPRESSION
.
.
.
'- ART_PRIMARY_EXPRESSION
'- ART_CONSTANT
'- 0
I assume that the grammar has been build this way for performance
reasons. The rule for conditional_expression for instance looks like:
conditional_expression
: logical_or_expression ('?' expression ':' conditional_expression)?
;
So it's actually not a conditional_expression if the part in parentheses
isn't there. So my initial rewrite:
conditional_expression : conditional_expression2 ->
^(ART_CONDITIONAL_EXPRESSION conditional_expression2);
conditional_expression2
: logical_or_expression ('?' expression ':' conditional_expression)?
;
creates a "wrong" AST if it's actually not a conditional expression and
I'm now trying to create the tree conditionally. My first approach was
to change the rules this way:
conditional_expression
: conditional_expression2 -> ^(ART_CONDITIONAL_EXPRESSION
conditional_expression2);
| logical_or_expression
;
conditional_expression2
: logical_or_expression '?' expression ':' conditional_expression
;
But, since there are 16 rules to adapt, after about 6 I run into
"internal error: ...: could not even do k=1 for decision 49; reason:
timed out (>1000ms)". So this creates too much ambiguity and therefore I
went for syntactic predicates:
conditional_expression
: ( logical_or_expression '?' expression ':' conditional_expression )
=> logical_or_expression '?' expression ':' conditional_expression ->
^(ART_CONDITIONAL_EXPRESSION logical_or_expression '?' expression ':'
conditional_expression)
| logical_or_expression
;
But the same problem (too ambiguous to compile). So I'm really stuck now
since I don't know how to do this correct (meaning efficient). I could
imagine that this can be done with semantic predicates (I don't know
how) but I'm not sure about the runtime overhead when parsing a C file.
Or I can go over the created tree myself and clean up the expressions.
But the tree is already substantial bigger than it should be because of
the intermediate wrong "foo_expression", so I guess solving the problem
at the grammar level is better.
Any opinions? Maybe I miss the completely obvious solution...
I think I should add that I'm a complete ANTLR beginner; so everything I
wrote might make no sense at all.
Thanks and have a nice day
Marco
More information about the antlr-interest
mailing list