[antlr-interest] Request for Change regarding Lexer (?)

Gavin Lambert antlr at mirality.co.nz
Thu Feb 18 10:38:11 PST 2010


At 09:27 17/02/2010, I wrote:
 >It'd be nice if there was some way to express a negative match
 >via a syntactic predicate, eg:
 >   FOOLIST: 'foo[' (('foo') => ~ | ID)+ ']';
 >(where '~' in an alt basically means "break", ie. match nothing
 >and terminate the innermost loop.)
 >Or, perhaps better:
 >   FOOLIST: 'foo[' (('foo') ~=> ID)+ ']';
 >(where '~=>' means "only take this path if the predicate *fails*")

Another possible syntax might be to allow ~ to act on sequences as 
well as sets.  Then you could use either of these:
   FOOLIST: 'foo[' ((~'foo') => ID)+ ']';
   FOOLIST: 'foo[' ((~('f' 'o' 'o')) => ID)+ ']';
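
To make the intended behaviour concrete, here is a rough sketch in 
Python rather than ANTLR.  The regex negative lookahead (?!foo) stands 
in for the proposed "(~'foo') =>" predicate: it consumes nothing and 
only lets the ID alternative proceed when the upcoming characters are 
not 'foo'.  The ID pattern and the is_foolist name are assumptions 
made up for this illustration, and Python's regex engine backtracks 
where an ANTLR lexer would not, so treat this as an approximation 
only.

```python
import re

# Hypothetical ID: an identifier followed by optional whitespace.
# The real ID rule is not given in this thread.
ID = r"[A-Za-z_][A-Za-z0-9_]*\s*"

# (?!foo) is a zero-width negative lookahead. It plays the role that
# "(~'foo') =>" would play in the proposed syntax: enter the ID
# alternative only if the next characters are NOT 'foo'.
FOOLIST = re.compile(r"foo\[(?:(?!foo)" + ID + r")+\]")


def is_foolist(text):
    """Return True if the whole text matches the sketched FOOLIST rule."""
    return FOOLIST.fullmatch(text) is not None
```

With these assumptions, "foo[bar baz]" is accepted, while 
"foo[bar foo]" and "foo[food]" are rejected because an element begins 
with 'foo'.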

This approach might be even more useful than the first, although 
it's harder to define what the result should be when a negated 
sequence is followed by a positive one.  For example, what should 
"~'foo' 'bar'" mean?  Is it "any three characters except 'foo', 
followed by 'bar'", or "any number of characters except 'foo', 
followed by 'bar'"?  Or is it just meaningless?  And what if it were 
"~('foo'|'quux') 'bar'"?  (I'm assuming this is all in lexer context, 
so these are sequences of characters rather than individual 
tokens.)
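
The two candidate readings of "~'foo' 'bar'" can be compared side by 
side with a small Python sketch.  This is only one way to pin down 
each reading, using regex lookahead as a stand-in for lexer 
behaviour; in particular, "any number of characters except 'foo'" is 
taken here to mean a run of characters in which 'foo' never begins.  
The names fixed_width, any_length and readings are made up for this 
illustration.

```python
import re

# Reading 1: exactly three characters that are not 'foo', then 'bar'.
fixed_width = re.compile(r"(?!foo).{3}bar")

# Reading 2 (one possible interpretation): any run of characters,
# possibly empty, at no point of which 'foo' begins, then 'bar'.
any_length = re.compile(r"(?:(?!foo).)*bar")


def readings(text):
    """Return (matches reading 1, matches reading 2) for the whole text."""
    return (
        fixed_width.fullmatch(text) is not None,
        any_length.fullmatch(text) is not None,
    )
```

Under these definitions "xyzbar" satisfies both readings, "foobar" 
satisfies neither, and "bar" or "abcdebar" satisfy only the second.  
The fixed-width reading also seems to have no obvious width once the 
alternatives differ in length, as in "~('foo'|'quux')".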


