[antlr-interest] Tree matching: wildcard-operator

Tobias Diez webmaster at altertoby.de
Fri Jul 29 15:08:28 PDT 2011


I dug a little deeper into this issue, and there seems to be a serious
problem with wildcards.

Not even the rule
^(ENVIRONMENT_BEGIN .) . ^(ENVIRONMENT_END .)

matches something like 
 (ENVIRONMENT_BEGIN test1) (ENVIRONMENT_END test1) 
(where nothing is between these two nodes)

Another issue emerges if one further reduces the rule to 
^(ENVIRONMENT_BEGIN) . ^(ENVIRONMENT_END) -> ^(ENVIRONMENT)
Then I get a "Wildcard invalid as root; wildcard can itself be a tree"
error.
So maybe the first problem is related to the second one, although the first
rule compiles without errors.

Any thoughts on how to match anything between two given nodes?
Thanks!


-----Original Message-----
From: Tobias Diez [mailto:webmaster at altertoby.de] 
Sent: Tuesday, July 26, 2011 00:02
To: antlr-interest at antlr.org
Subject: Tree matching: wildcard-operator

Hi,

Are there any special concerns regarding the wildcard operator in a tree
grammar (with filter=true)?

I ask because 
^(ENVIRONMENT_BEGIN .) .* ^(ENVIRONMENT_END .)

does not match the expected input UNLESS the LAST subtree in the file is of
the form "^(ENVIRONMENT_END .)".
A deeper investigation showed that the wildcard's "if the next token equals
the end token, then stop" logic did not fire, although it should recognize
the environment-end subtree. So the .* matches everything and the parser runs
until it reaches the last subtree of the input stream. This last subtree is
then compared to "^(ENVIRONMENT_END .)".
So "^(ENVIRONMENT_BEGIN a) b ^(ENVIRONMENT_END c)" is recognized but
"^(ENVIRONMENT_BEGIN a) b ^(ENVIRONMENT_END c) SOMETHING" is not.
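
To make the expected behaviour concrete, here is a minimal sketch in plain
Python. It is only a model of non-greedy matching over a flat list of sibling
subtrees, not ANTLR's implementation; the Node tuple and the function
match_environment are hypothetical names chosen for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical model: each sibling subtree is (root_token, child_text).
Node = Tuple[str, str]


def match_environment(siblings: List[Node]) -> Optional[Tuple[int, int]]:
    """Return (begin_index, end_index) of the first BEGIN ... END span.

    The wildcard part is non-greedy: scanning stops at the FIRST
    ENVIRONMENT_END that follows an ENVIRONMENT_BEGIN, so trailing
    siblings after the END node do not affect the match.
    """
    for i, (root, _) in enumerate(siblings):
        if root != "ENVIRONMENT_BEGIN":
            continue
        # Consume zero or more arbitrary siblings until an END root appears.
        for j in range(i + 1, len(siblings)):
            if siblings[j][0] == "ENVIRONMENT_END":
                return (i, j)
    return None


with_trailing = [
    ("ENVIRONMENT_BEGIN", "a"),
    ("TEXT", "b"),
    ("ENVIRONMENT_END", "c"),
    ("TEXT", "SOMETHING"),
]
print(match_environment(with_trailing))
```

In this model both inputs from the paragraph above match, with or without the
trailing SOMETHING node, which is the behaviour I would expect from the tree
grammar as well.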

Is everything correct in the code above or is there a bug in ANTLR?

Thanks!


