[antlr-interest] Request for Change regarding Lexer (?)

kferrio at gmail.com kferrio at gmail.com
Thu Feb 18 12:27:15 PST 2010


Gavin,

Of your three ideas (two equivalents in one post and a third, more powerful one in this post), I currently like the second equivalent from the first post, viz. ~=>, for two reasons.  First, I think your third form is so powerful that it could encourage abuse and expose a lot of strange edge cases for Terence to anticipate and guard against.  Not sure, but it seems possible.  Second, I just like having one token (second equivalent) instead of two tokens (first equivalent) to express the idea cleanly.  Admittedly this is a style preference, and I have been accused of questionable taste more than once.  :)  Your idea is helpful, and I would use it regardless of how it is expressed in ANTLR.

Cheers,
Kyle 


-----Original Message-----
From: Gavin Lambert <antlr at mirality.co.nz>
Date: Fri, 19 Feb 2010 07:38:11 
To: Terence Parr<parrt at cs.usfca.edu>
Cc: <antlr-interest at antlr.org>
Subject: Re: [antlr-interest] Request for Change regarding Lexer (?)

At 09:27 17/02/2010, I wrote:
 >It'd be nice if there was some way to express a negative match
 >via a syntactic predicate, eg:
 >   FOOLIST: 'foo[' (('foo') => ~ | ID)+ ']';
 >(where '~' in an alt basically means "break", ie. match nothing
 >and terminate the innermost loop.)
 >Or, perhaps better:
 >   FOOLIST: 'foo[' (('foo') ~=> ID)+ ']';
 >(where '~=>' means "only take this path if the predicate *fails*")

Another possible syntax might be to allow ~ to act on sequences as 
well as sets.  Then you could use either of these:
   FOOLIST: 'foo[' ((~'foo') => ID)+ ']';
   FOOLIST: 'foo[' ((~('f' 'o' 'o')) => ID)+ ']';
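For readers who want to experiment with the intended semantics outside ANTLR, here is a rough sketch using Python's re module, where a negative lookahead (?!foo) plays the role of the proposed negated predicate. This is only an analogy, not ANTLR behaviour: the ID pattern and the helper name is_foolist are assumptions made for illustration, since the post does not define ID.

```python
import re

# Assumption: ID is a conventional identifier; the post does not define it.
ID = r"[A-Za-z_][A-Za-z0-9_]*"

# (?!foo) is a negative lookahead: the ID branch is taken only when the
# upcoming input does NOT start with 'foo', mirroring the proposed
# "take this path only if the predicate fails" idea.
FOOLIST = re.compile(r"foo\[(?:(?!foo)" + ID + r")+\]")

def is_foolist(text):
    """Return True if the whole string matches the sketched FOOLIST rule."""
    return FOOLIST.fullmatch(text) is not None
```

With this sketch, is_foolist("foo[bar]") succeeds while is_foolist("foo[foo]") is rejected, because the lookahead blocks the ID branch as soon as 'foo' is the next thing in the input.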

This approach might be even more useful than the first, although 
it's harder to define what the result should be if you try to 
combine negative and positive sequences.  For example, what should 
"~'foo' 'bar'" mean?  Is it "any three characters except 'foo', 
followed by 'bar'", or "any number of characters except 'foo', 
followed by 'bar'"?  Or is it just meaningless?  What if it were 
"~('foo'|'quux') 'bar'"?  [Assuming this is all in lexer context, 
so these are sequences of characters rather than individual 
tokens.]
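To make the ambiguity concrete, the two candidate readings of "~'foo' 'bar'" can be approximated with Python regular expressions. This is a sketch under assumptions, not a statement of what ANTLR does or should do: the names READING_FIXED and READING_ANY are hypothetical, and the second pattern is one possible way to model "any number of characters except 'foo'".

```python
import re

# Reading 1: exactly three characters that are not 'foo', followed by 'bar'.
READING_FIXED = re.compile(r"(?!foo).{3}bar")

# Reading 2: any number of characters, none of which begins an occurrence
# of 'foo', followed by 'bar'.
READING_ANY = re.compile(r"(?:(?!foo).)*bar")

def matches(pattern, text):
    """Return True if the pattern matches the entire string."""
    return pattern.fullmatch(text) is not None
```

Both readings accept "abcbar" and reject "foobar", but they disagree on inputs such as "abbar" or "abcdbar", which only the second reading accepts. That disagreement is the part a grammar author would need defined.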

