[antlr-interest] mild simplification and tree grammars
Giampaolo Tomassoni
Giampaolo at Tomassoni.biz
Tue Apr 20 04:51:36 PDT 2010
Well, after your good advices I could get a bit further in my grammar.
Now I have a general question and a specific one about tree grammars.
I would like to attempt a kind of mild simplification of an arithmetic and
boolean expression.
I see that a lot of interesting work can be done with a replace grammar like
the following:
s : ^(UMINUS r=NUMBER) { shc = true; }
-> NUMBER[num($r).negate().toString()]
| ^(NOT FALSE) { shc = true; }
-> TRUE
| ^(NOT TRUE) { shc = true; }
-> FALSE
| ^(PLUS l=NUMBER r=NUMBER) { shc = true; }
-> NUMBER[num($l).add(num($r)).toString()]
| ^(PLUS l=NUMBER {num($l) == BigDecimal.ZERO}? r1=.) { shc = true; }
-> $r1
| ^(PLUS l1=. r=NUMBER {num($r) == BigDecimal.ZERO}?) { shc = true; }
-> $l1
...
Where 'shc' stands for "Something Has Changed" and is going to be used to
decide if a further execution of the simplifier grammar is or isn't
required. num(CommonTree) is a shorthand for "new
BigDecimal(node.getText())".
The general question is: is there a simply way to simplify things like:
1 + a + 2
?
Of course I get something like (PLUS (PLUS NUMBER IDENT) NUMBER) and I see
it could be simplified as (PLUS (PLUS NUMBER NUMBER) IDENT). I may guess
some that constructs to obtain this are possible, but they seems to me to
increase the complexity of the grammar a lot and I don't know if I'm going
to swim in too deep waters. I mean, is this the way compilers do simplifies
expression, or instead there is something well-unknown to my knowledge? Any
acronym to spare?
The specific question is: the above (piece of) tree grammar needs to compile
with backtrack=true, which I don't like. I'm going to re-create a LLR
grammar doing the same. I see it mimes a lot the parsing grammar I used to
generate the source tree, but with a lot more cases in rules. This sounds
fine to me, but then I'm still getting some problem:
protected
conditionalOrExpression
: OR FALSE r=conditionalOrExpression {shc=true;} -> $r
| OR TRUE conditionalOrExpression {shc=true;} -> TRUE
| OR l=conditionalAndExpression r=conditionalOrRightSide[$l.tree] ->
$r
| conditionalAndExpression
;
protected
conditionalOrRightSide [CommonTree l]
: FALSE {shc=true;} -> $l
| TRUE {shc=true;} -> TRUE
| r=conditionalOrExpression -> OR $l $r
;
conditionalOrExpression2 rises the error "(137): reference to undefined
label in rewrite rule: $l", which is the same I got in the parsing grammar
and which was circumvented by adopting a different notation (thanks to
John).
I have ANTLR 3.2. Is it a bug or am I the bug?
Regards,
Giampaolo
More information about the antlr-interest
mailing list