[antlr-interest] All except...

Phil Oliver antlr at olivercomputing.com
Mon Jun 4 16:26:36 PDT 2007


This might be a terribly simple question, but I've looked pretty 
hard and can't find an obvious solution or any examples in the sample 
grammars: how does one create an ANTLR v3 rule (either lexer or 
parser) that matches any character EXCEPT those in a given set? 
For example, let's say I have:

Char : '\u0009' | '\u000A' | '\u000D' | '\u0020'..'\uD7FF' | 
'\uE000'..'\uFFFD';

and I want to define a rule that matches any Char except another list 
of characters. In the EBNF grammar used in the XQuery spec, for 
example, it would be:

Char2: Char - ('<' | '>');

which would cause Char2 to match any character in Char except for '<' 
or '>'. But that set-difference operator ('-') evidently isn't part 
of ANTLR. I've looked at the ~ unary operator, but that doesn't handle 
this job, unless I'm overlooking something.

I understand that one could explicitly define Char2 as the ranges 
from Char with '<' and '>' already excluded, but that's tedious and 
hides the abstract meaning of the definition.
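
As a rough sketch of that explicit approach (assuming '<' is U+003C 
and '>' is U+003E, so the '\u0020'..'\uD7FF' range from Char has to be 
split around them), the rule would look something like this:

```
// Hand-expanded form of "Char - ('<' | '>')"; illustrative only.
Char2 : '\u0009' | '\u000A' | '\u000D'
      | '\u0020'..'\u003B'   // up to ';', stopping just before '<'
      | '\u003D'             // '=', which sits between '<' and '>'
      | '\u003F'..'\uD7FF'   // from '?', just after '>'
      | '\uE000'..'\uFFFD'
      ;
```

The intent (Char minus two characters) is no longer visible in the 
rule itself, and every further exclusion means splitting another range 
by hand.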


