[antlr-interest] The NOT (~) Operator
Gavin Lambert
antlr at mirality.co.nz
Sat Apr 12 06:14:38 PDT 2008
At 20:10 12/04/2008, Sven Busse wrote:
>INDENTATION
> : TAB* ~NEWLINE;
>
>NEWLINE
> : '\r'? '\n';
>
>fragment
>TAB : '\t';
>
>Checking the grammar with ANTLRWorks gives me this error:
>
>simpletest.g:0:0: syntax error: buildnfa: <AST>:6:11: unexpected
>AST node: ?
>
>The problem seems to relate to the "~NEWLINE", because if i delete
>it, i get no error. Also, if i change the "INDENTATION" to a parser
>rule "indentation", i get no error, but that is not an option for me.
>
>Can someone explain to me, what the reason behind this error is?

In the lexer, "~" inverts a "set" of characters (a group of single
character alternatives). It cannot be used on a "sequence" (one
or more characters following another character).

In the parser, "~" similarly operates on "sets", but this time
they're sets of tokens. Just like the lexer, though, it can't be
used on a sequence of tokens.

Since NEWLINE is a single token, it's valid to invert it at the
parser level (you're saying "any token except NEWLINE"). At the
lexer level you can't invert it though -- that would be translated
as "any single character except the sequence of '\n' optionally
preceded by a '\r'", which doesn't really make sense.

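A rough, untested sketch of the distinction, in ANTLR 3 syntax. The
grammar name NotSketch and the rules "line" and OTHER are made up for
illustration and are not part of the original grammar. Both uses of
"~" here are applied to a set rather than a sequence, so both should
be accepted:

```antlr
grammar NotSketch;

// Parser rule: ~NEWLINE means "any single token except NEWLINE".
line
    : (~NEWLINE)* NEWLINE
    ;

NEWLINE
    : '\r'? '\n'
    ;

// Lexer rule: "~" applied to a set of single characters, which is allowed.
// (Hypothetical catch-all rule, one character per token.)
OTHER
    : ~('\r' | '\n')
    ;
```

Writing ~NEWLINE inside a lexer rule such as OTHER is the case that
fails, because NEWLINE there stands for a character sequence rather
than a single character.
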
In this case, your best bet is probably to spell it out explicitly:

INDENTATION
  : TAB ~('\r' | '\n')
  ;

... but bear in mind that this will consume whichever non-newline
character follows the tab (making it part of the INDENTATION
token), which may not be what you really want.