[antlr-interest] Non-disjoint tokens

Gavin Lambert antlr at mirality.co.nz
Sun Dec 2 23:27:23 PST 2007


At 17:24 3/12/2007, Steve Bennett wrote:
 >QUESTION: Why doesn't putting ~DIGITS in the syntactic predicate work?

Because you can only invert sets, not sequences.  In other words, 
this works:

fragment DIGIT: '0'..'9';
DIGITS: DIGIT+;
NONDIGITS: (~DIGIT)+;

but this doesn't:

DIGITS: ('0'..'9')+;
NONDIGITS: ~DIGITS;

Note that ('a' | 'b') is still a set, so it can be inverted; 'ab' is 
a sequence, so it can't be.  (All of these examples assume you're in 
the lexer.  The rule is the same in the parser, but it presents 
itself differently, since each item in the set can be a complete 
token instead of just a single character.  Inverting in the parser 
isn't common anyway, though.)
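
As a rough sketch of the set/sequence distinction, assuming ANTLR 3 
lexer syntax (the rule names NOT_AB and NOT_SEQ are made up for 
illustration and aren't from the original question):

NOT_AB: ~('a' | 'b');          // fine: inverts a set of single characters
NONDIGITS: (~('0'..'9'))+;     // fine: invert the set inline, then repeat it
// NOT_SEQ: ~('ab');           // not allowed: 'ab' is a sequence, not a set

In each working case the ~ applies to one character position at a 
time, and any repetition goes outside the inversion.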


