[antlr-interest] Request for Change regarding Lexer (?)
kferrio at gmail.com
Thu Feb 18 12:27:15 PST 2010
Gavin,
Of your three ideas (two equivalents in one post and a third, more powerful one in this post), I currently like the second equivalent from the first post, viz. ~=>, for two reasons. First, I think your third form is so powerful that it could encourage abuse and expose a lot of strange edge cases for Terence to anticipate and guard against. I'm not sure, but it seems possible. Second, I just like having one token (second equivalent) instead of two tokens (first equivalent) to express the idea cleanly. Admittedly this is a style preference, and I have been accused of questionable taste more than once. :) Your idea is helpful, and I would use it regardless of how it is expressed in ANTLR.
Cheers,
Kyle
-----Original Message-----
From: Gavin Lambert <antlr at mirality.co.nz>
Date: Fri, 19 Feb 2010 07:38:11
To: Terence Parr<parrt at cs.usfca.edu>
Cc: <antlr-interest at antlr.org>
Subject: Re: [antlr-interest] Request for Change regarding Lexer (?)
At 09:27 17/02/2010, I wrote:
>It'd be nice if there was some way to express a negative match
>via a syntactic predicate, eg:
> FOOLIST: 'foo[' (('foo') => ~ | ID)+ ']';
>(where '~' in an alt basically means "break", ie. match nothing
>and terminate the innermost loop.)
>Or, perhaps better:
> FOOLIST: 'foo[' (('foo') ~=> ID)+ ']';
>(where '~=>' means "only take this path if the predicate *fails*")
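As a rough illustration of the '~=>' semantics described above, here is a minimal hand-written matcher in Python. It is only a sketch under stated assumptions: ANTLR has no such operator, the names match_foolist and ID are hypothetical, and the ID pattern is a guess because the post never defines ID.

```python
import re

# Hypothetical ID rule; the original post does not define ID.
ID = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def match_foolist(text, pos=0):
    """Return the end offset of a FOOLIST match starting at pos, or None."""
    if not text.startswith("foo[", pos):
        return None
    pos += len("foo[")
    iterations = 0
    while True:
        # Proposed '~=>': take the ID path only if the predicate 'foo' FAILS here.
        if text.startswith("foo", pos):
            break
        m = ID.match(text, pos)
        if m is None:
            break
        pos = m.end()
        iterations += 1
    # The (...)+ loop needs at least one iteration, then a closing bracket.
    if iterations == 0 or not text.startswith("]", pos):
        return None
    return pos + 1
```

Under these assumptions "foo[bar]" matches, while "foo[foo]" does not. Note that "foo[food]" is rejected as well, because the predicate only looks at the next three characters.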
Another possible syntax might be to allow ~ to act on sequences as
well as sets. Then you could use either of these:
FOOLIST: 'foo[' ((~'foo') => ID)+ ']';
FOOLIST: 'foo[' ((~('f' 'o' 'o')) => ID)+ ']';
This approach might be even more useful than the first, although
it's harder to define what the result should be if you try to make
a sequence out of both negative and positive sequences. For example,
what should "~'foo' 'bar'" mean? Is it "any three characters except
'foo', followed by 'bar'", or "any number of characters except
'foo', followed by 'bar'"? Or is it just meaningless? What if it were
"~('foo'|'quux') 'bar'"? (I'm assuming this is all in lexer context,
so these are sequences of characters rather than individual
tokens.)
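To make the ambiguity concrete, the two readings of "~'foo' 'bar'" can be approximated with Python regular expressions. This is only a sketch: the names THREE_CHARS, ANY_RUN and readings are hypothetical, and the second pattern assumes "any number of characters except 'foo'" means a run of characters in which 'foo' never starts, which is itself just one possible reading.

```python
import re

# Reading 1: exactly three characters that are not 'foo', then 'bar'.
THREE_CHARS = re.compile(r"(?!foo).{3}bar", re.DOTALL)

# Reading 2 (assumed): any run of characters in which 'foo' never
# starts, then 'bar'.
ANY_RUN = re.compile(r"(?:(?!foo).)*bar", re.DOTALL)


def readings(text):
    """Return (reading1_matches, reading2_matches) for the whole text."""
    return (
        THREE_CHARS.fullmatch(text) is not None,
        ANY_RUN.fullmatch(text) is not None,
    )
```

Both readings accept "bazbar" and reject "foobar", but they disagree on inputs such as "xbar" and "bar", which only the second reading accepts.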