[antlr-interest] Exploit ambiguity in tree rewriter

Benjamin S Wolf jokeserver at gmail.com
Tue Jun 12 22:35:12 PDT 2012


If I were programming this for a Python target, I'd guess it would
look something like:

topdown : ^( MULT (arg+=.)* )
    -> { filter_out_identities(MULT, $arg) };

with a function filter_out_identities defined in a header somewhere:

def filter_out_identities(node, nodes):
    tree = CommonTree(node)
    tree.addChildren(n for n in nodes if not n.has("IDENTITY"))
    return tree

I'm not sure what you really want to do. You were asking how to remove
all of a token from a list, and you gave an example that indicated you
had a function "has" that would let you check whether a token was an
identity, and you're matching against every token. Sorry if I
misunderstood that.

If you only have one token that could be an identity, then you could
do something like:

topdown : ^( MULT (a+=~IDENTITY)* (IDENTITY* b+=~IDENTITY*)* IDENTITY*
    -> ^( MULT a* b* );

and no actions are required.

You might even be able to do

^( MULT ( IDENTITY | a+=~IDENTITY )* ) -> ^( MULT a* )

This is based on my understanding of parser rules; I'm not sure how
different it would be for tree parser rules.

On Tue, Jun 12, 2012 at 10:07 PM, nafur <nafur42 at gmail.com> wrote:
> Hi,
>
> While I'm not completely sure how this would look like, this would
> somehow result in implementing the pattern matching myself, right?
> As this example is a very easy one, this is no option for me. Actually,
> I'm only using ANTLR because of this pattern matching...
>
> Gereon
>
> On 06/13/2012 05:34 AM, Benjamin S Wolf wrote:
>> The best way to accomplish this is likely to be an arbitrary action, eg.:
>>
>> a : INT -> { new CommonTree(new CommonToken(FLOAT, $INT.text + ".0")) } ;
>>
>> This takes an INT and returns a Tree with one node, a FLOAT token with
>> text that is the int plus ".0".
>>
>> You'll want to make a tree via CommonTree() and add a list of children
>> that excludes all tokens that have IDENTITY (or add each child if it
>> doesn't have the identity). This means you'll be looping over all the
>> nodes in the MULT subtree within the action.
>>
>> On Mon, Jun 11, 2012 at 2:01 AM, nafur <nafur42 at gmail.com> wrote:
>>> Hi,
>>>
>>> I asked about this ambiguity quite a while ago. Does nobody has any idea
>>> how to do this?
>>>
>>> Gereon
>>>
>>> On 05/29/2012 12:48 PM, nafur wrote:
>>>> Hi,
>>>>
>>>> I'm using tree rewriters to manipulate formulas that are represented as
>>>> ASTs. One operation I'd like to do is: "remove any identity from a
>>>> multiplication".
>>>>
>>>> The replacement rule I'm currently working with is the following:
>>>>
>>>> topdown :
>>>>       ^( MULT (before += .)* x=. (after += .)* )
>>>>       { $x.has("IDENTITY") }?
>>>>       -> ^( MULT $before* $after* )
>>>> ;
>>>>
>>>> As you probably see, we have some ambiguity here: before and after can
>>>> both take arbitrarily many elements. If I take an AST like
>>>> ^( MULT A I x ), the replacement will not match. printing some debug
>>>> output reveals, that ANTLR will only check $x.has("IDENTITY"), i.e. will
>>>> only try to match the last element in a multiplication.
>>>> (That is actually not a surprise, as ANTLR issues a warning...)
>>>>
>>>> Is there any way to resolve this ambiguity in a way, such that I can
>>>> match all occurrences?
>>>>
>>>> Thanks,
>>>> Gereon
>>>
>>> List: http://www.antlr.org/mailman/listinfo/antlr-interest
>>> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address


More information about the antlr-interest mailing list