[antlr-interest] v4 bug: &x and &~x are including match in token
Peter Boughton
boughtonp at gmail.com
Sat Jan 21 11:26:50 PST 2012
My understanding is that the & operator is intended to act as a
lookahead: it ensures the following content matches, but does not
include that content in the token text.
(as described here:
http://www.antlr.org/wiki/display/~admin/ANTLR+v4+lexers#ANTLRv4lexers-Requirements
)
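To make the expected semantics concrete, here is a minimal sketch that uses Python's regex lookahead `(?=...)` as a stand-in for `&x`. It is only an analogy and not ANTLR code; the assumption is that v4's `&` is meant to behave like a zero-width assertion.

```python
import re

text = "output>"

# Lookahead: '>' must follow, but it stays in the input for the next token.
lookahead = re.compile(r"output(?=>)").match(text)

# Plain match: '>' becomes part of the token text (the behaviour I am seeing).
consuming = re.compile(r"output>").match(text)

print(lookahead.group(), lookahead.end())  # output 6
print(consuming.group(), consuming.end())  # output> 7
```

With the lookahead form, matching resumes at index 6, so the `>` is still available to whichever rule comes next.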
However, this is not the behaviour I'm seeing - the text matched by
the lookahead is included as part of the token (which prevents it
from being included in the next token, and thus causes problems).
OUT_ATTR_ENABLE_OUTPUT
    : 'output' WS* EQUALS WS* ATTR_TRUE
    | 'output' WS+ &~'='
    | 'output' &'>'
        { OutputEnabled = true; }
    ;
Sample input:
<cffunction output> #Special# </cffunction>
<cffunction output > #Special# </cffunction>
<cffunction output anotherattr > #Special# </cffunction>
Captured token:
OUT_ATTR_ENABLE_OUTPUT = [output>]
OUT_ATTR_ENABLE_OUTPUT = [output >]
OUT_ATTR_ENABLE_OUTPUT = [output a]
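For comparison, this is roughly the token text I would expect for the same three inputs. The sketch emulates only the two lookahead alternatives with Python regex assertions (`(?!=)` for `&~'='`, `(?=>)` for `&'>'`); the first alternative is left out because `EQUALS` and `ATTR_TRUE` are not shown here, and `expected_token` is a hypothetical helper name, not part of any grammar or API.

```python
import re

# Stand-ins for the two lookahead alternatives of OUT_ATTR_ENABLE_OUTPUT:
#   'output' WS+ &~'='   ->  output\s+(?!=)
#   'output' &'>'        ->  output(?=>)
OUTPUT_ATTR = re.compile(r"output\s+(?!=)|output(?=>)")

def expected_token(line):
    """Return the text captured for the 'output' attribute, or None."""
    m = OUTPUT_ATTR.search(line)
    return m.group() if m else None

samples = [
    "<cffunction output> #Special# </cffunction>",
    "<cffunction output > #Special# </cffunction>",
    "<cffunction output anotherattr > #Special# </cffunction>",
]

for line in samples:
    print("[%s]" % expected_token(line))
```

This prints `[output]`, `[output ]` and `[output ]`: the whitespace matched by `WS+` belongs to the token, but the `>` or `a` that the lookahead inspects does not.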
I have used &~x in other situations and it seemed to work, although
maybe those were just cases where it didn't matter whether the
lookahead match was included.