[antlr-interest] Exploit ambiguity in tree rewriter

nafur nafur42 at gmail.com
Thu Jun 21 08:36:12 PDT 2012


Hi,

Here are two examples for more complex replacements:

Q * Q^T (where Q is any subtree) -> I
A x = b (where x is any subtree) -> x = VAR (new variable)

As I have to find such a pattern (Q Q^T) no matter where it occurs
within an arbitrarily shaped multiplication, the recursive construction
does not seem like an option to me.

How would you implement this rule?

Gereon

On 06/14/2012 07:10 PM, Stefan Mätje (U-Boot Mail) wrote:
> Am 14.06.2012 13:59:43 schrieb(en) nafur:
>> Hi,
>>
>> I already thought about this, but this yields new problems. If I build
>> my ASTs like this, I have to take care of all possible multiplications.
> 
> I don't think so. Because the expr rule in my bare bones example is applied 
> recursively and from the bottom of the tree up multiple(!) times by the 
> bottomup rule all it needs to take care for is one operator and two operands 
> (either of them could be an "expr" expression).
> 
>> To match A B I have to match ^(* A B), ^(* A ^(* B ...)), ^(* ^(* ... A) B), 
>> ^(* ^(* ... A) ^(* B ... )).
> 
> Yes, at first they look as a lot of variants but it always boils down to a
> ^(* aExpr bExpr). Please really have a look at the examples for expression 
> evaluation on the ANTLR website. And use the link to the Wiki I provided.
> 
> Have a look at the example AST I suggested below:
> 
> ^(* ^(* ^(* ^(* 5 3) 1) 2) 4)
> 
> ANTLR will match and apply the expr rule 4 times to this AST in this manner:
> 
> 1. match: ^(* 5 3)        ->  ^(* 5 3)
> 2. match: ^(* ^(* 5 3) 1) ->  ^(* 5 3) // real rewrite propagetes only this up
> 3. match  ^(* ^(* 5 3) 2) ->  ^(* ^(* 5 3) 2)  // no change
> 4. match  ^(* ^(* ^(* 5 3) 2) 4)  -> ^(* ^(* ^(* 5 3) 2) 4)
> 
> The 1 has vanished as expected and only 3 multiplications are left.
> 
>> As I will at some point go for rules that match more than two factors,
>> this does not seem like a real option to me...
> 
> What kind of operators do you have that take more than two operands? I
> can't imagine a use case at the moment.
> 
> Stefan
> 
>> Gereon
>>
>> On 06/14/2012 12:00 PM, Stefan Mätje wrote:
>>> Hi nafur,
>>>
>>> from what I've read in this thread till now I got the impression that
>>> you've structured your AST in a way that counteracts your goals.
>>>
>>> I'll try to give a simple example to get us on common ground. I will
>>> refer here only to simple scalar multiplication with an expression like
>>> this:
>>>
>>> "5 * 3 * 1 * 2 * 4"
>>>
>>> As far as I understand from your writings you would build an AST like
>>> the following from this input
>>>
>>> "^(MULT 5 3 1 2 4)
>>>
>>> and try to match it with a rule like this
>>>
>>> topdown :
>>>     ^( MULT (before += .)* x=. (after += .)* )
>>>         { $x.has("IDENTITY") }?
>>>     ->  ^( MULT $before* $after* )
>>> ;
>>>
>>> How should ANTLR generated code distribute the tokens from the AST to
>>> your variables "before", "x" and "after"? You're not giving ANTLR any
>>> hint about it. I guess you would end up with:
>>> before: "5 3 1 2"
>>> x:      "4"
>>> after:  ""    // empty
>>>
>>> And you must to have "x" assigned the "1" to make your code work.
>>>
>>> Doing expression evaluation and simplification is done differently with
>>> ANTLR and leads to ASTs that combine two operands with one operator that
>>> can then be simplified in a straightforward manner.
>>>
>>> A more appropriate AST structure for the before mentioned input would be
>>> this recursive structure:
>>>
>>> ^(* ^(* ^(* ^(* 5 3) 1) 2) 4)
>>>
>>> In a "pretty printed" style it would appear like this.
>>>
>>> ^(*
>>>     ^(*
>>>         ^(*
>>>             ^(*
>>>                 5
>>>                 3
>>>             )
>>>             1
>>>         )
>>>         2
>>>     )
>>>     4
>>> )
>>>
>>> This way you have only to deal with an operator and two operands in one
>>> rule. A bare bones example (not working) looks like this:
>>>
>>> bottomup:
>>>     expr
>>>     ;
>>>
>>> expr:
>>>     ^(* a=expr b=INT {$b.text()== "1"}?)    ->    $a
>>>     ^(* a=INT {$a.text()== "1"}? b=expr)    ->    $b
>>>     ;
>>>
>>> This way you can now match special things in the AST and act on them
>>> testing some properties of the trees. You can see the simplification
>>> that the result of "expr" is one of the operands if one of the others is
>>> a literal "1".
>>>
>>> For a better explanation and an example look at
>>> http://www.antlr.org/wiki/display/~admin/2008/11/30/Example+tree+rewriting
>> +with+patterns
>>>
>>>
>>>
>>> I hope I did not misunderstand what you want to do.
>>>
>>> Stefan
>>>
>>>
>>> Am 13.06.2012 10:18, schrieb nafur:
>>>> Hi,
>>>>
>>>> Ok, I guess I have to somehow redefine the nature of the rewrite rules I
>>>> want to use:
>>>>
>>>> My ASTs represent mathematical formulas and .has() checks, if this
>>>> symbol has some special mathematical property (like "is identity", "is
>>>> symmetric", "is invertible", "is orthogonal", ... I'm working with
>>>> matrices).
>>>>
>>>> Thus, this example is really the simplest one. Another example is:
>>>>
>>>> ^( MULT (before += .)* x=. ^( POW y=. TRANSPOSE) (after += .)* )
>>>> { $x.equals($y)&&  $x.has("ORTHOGONAL") }?
>>>> ->  ^( MULT $before* IDENTITY $after* )
>>>>
>>>> i.e.: replace any Q Q^T by the identity, if Q is orthogonal. Note that
>>>> this rule requires a comparison of whole subtrees ($x.equals($y)).
>>>>
>>>> Therefore, filtering single symbols would work for my first example (I
>>>> somehow selected a quite bad example). For more complex examples, I
>>>> really want to rely on the pattern matching offered by ANTLR.
>>>>
>>>> What I would like: Instead of resolving ambiguity by just choosing a
>>>> single possibility, try all of them (what makes more sense for rewrite
>>>> rules anyway).
>>>>
>>>> Gereon
>>>>
>>>> On 06/13/2012 07:35 AM, Benjamin S Wolf wrote:
>>>>> If I were programming this for a Python target, I'd guess it would
>>>>> look something like:
>>>>>
>>>>> topdown : ^( MULT (arg+=.)* )
>>>>>      ->  { filter_out_identities(MULT, $arg) };
>>>>>
>>>>> with a function filter_out_identities defined in a header somewhere:
>>>>>
>>>>> def filter_out_identities(node, nodes):
>>>>>      tree = CommonTree(node)
>>>>>      tree.addChildren(n for n in nodes if not n.has("IDENTITY"))
>>>>>      return tree
>>>>>
>>>>> I'm not sure what you really want to do. You were asking how to remove
>>>>> all of a token from a list, and you gave an example that indicated you
>>>>> had a function "has" that would let you check whether a token was an
>>>>> identity, and you're matching against every token. Sorry if I
>>>>> misunderstood that.
>>>>>
>>>>> If you only have one token that could be an identity, then you could
>>>>> do something like:
>>>>>
>>>>> topdown : ^( MULT (a+=~IDENTITY)* (IDENTITY* b+=~IDENTITY*)* IDENTITY*
>>>>>      ->  ^( MULT a* b* );
>>>>>
>>>>> and no actions are required.
>>>>>
>>>>> You might even be able to do
>>>>>
>>>>> ^( MULT ( IDENTITY | a+=~IDENTITY )* ) ->  ^( MULT a* )
>>>>>
>>>>> This is based on my understanding of parser rules; I'm not sure how
>>>>> different it would be for tree parser rules.
>>>>>
>>>>> On Tue, Jun 12, 2012 at 10:07 PM, nafur<nafur42 at gmail.com>  wrote:
>>>>>> Hi,
>>>>>>
>>>>>> While I'm not completely sure how this would look like, this would
>>>>>> somehow result in implementing the pattern matching myself, right?
>>>>>> As this example is a very easy one, this is no option for me. Actually,
>>>>>> I'm only using ANTLR because of this pattern matching...
>>>>>>
>>>>>> Gereon
>>>>>>
>>>>>> On 06/13/2012 05:34 AM, Benjamin S Wolf wrote:
>>>>>>> The best way to accomplish this is likely to be an arbitrary
>>>>>>> action, eg.:
>>>>>>>
>>>>>>> a : INT ->  { new CommonTree(new CommonToken(FLOAT, $INT.text +
>>>>>>> ".0")) } ;
>>>>>>>
>>>>>>> This takes an INT and returns a Tree with one node, a FLOAT token with
>>>>>>> text that is the int plus ".0".
>>>>>>>
>>>>>>> You'll want to make a tree via CommonTree() and add a list of children
>>>>>>> that excludes all tokens that have IDENTITY (or add each child if it
>>>>>>> doesn't have the identity). This means you'll be looping over all the
>>>>>>> nodes in the MULT subtree within the action.
>>>>>>>
>>>>>>> On Mon, Jun 11, 2012 at 2:01 AM, nafur<nafur42 at gmail.com>  wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I asked about this ambiguity quite a while ago. Does nobody has
>>>>>>>> any idea
>>>>>>>> how to do this?
>>>>>>>>
>>>>>>>> Gereon
>>>>>>>>
>>>>>>>> On 05/29/2012 12:48 PM, nafur wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I'm using tree rewriters to manipulate formulas that are
>>>>>>>>> represented as
>>>>>>>>> ASTs. One operation I'd like to do is: "remove any identity from a
>>>>>>>>> multiplication".
>>>>>>>>>
>>>>>>>>> The replacement rule I'm currently working with is the following:
>>>>>>>>>
>>>>>>>>> topdown :
>>>>>>>>>        ^( MULT (before += .)* x=. (after += .)* )
>>>>>>>>>        { $x.has("IDENTITY") }?
>>>>>>>>>        ->  ^( MULT $before* $after* )
>>>>>>>>> ;
>>>>>>>>>
>>>>>>>>> As you probably see, we have some ambiguity here: before and
>>>>>>>>> after can
>>>>>>>>> both take arbitrarily many elements. If I take an AST like
>>>>>>>>> ^( MULT A I x ), the replacement will not match. printing some debug
>>>>>>>>> output reveals, that ANTLR will only check $x.has("IDENTITY"),
>>>>>>>>> i.e. will
>>>>>>>>> only try to match the last element in a multiplication.
>>>>>>>>> (That is actually not a surprise, as ANTLR issues a warning...)
>>>>>>>>>
>>>>>>>>> Is there any way to resolve this ambiguity in a way, such that I can
>>>>>>>>> match all occurrences?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Gereon
>>>>>>>>
>>>>>>>> List: http://www.antlr.org/mailman/listinfo/antlr-interest
>>>>>>>> Unsubscribe:
>>>>>>>> http://www.antlr.org/mailman/options/antlr-interest/your-email-
>> address
>>>>>>>>
>>>>
>>>> List: http://www.antlr.org/mailman/listinfo/antlr-interest
>>>> Unsubscribe:
>>>> http://www.antlr.org/mailman/options/antlr-interest/your-email-address
>>>>
>>
>>
> 
> 
> 


More information about the antlr-interest mailing list