[antlr-interest] greedy=false for lexers by default
Gavin Lambert
antlr at mirality.co.nz
Sat May 24 05:32:32 PDT 2008
At 10:32 24/05/2008, Terence Parr wrote:
> I'm thinking of changing lexers to use greedy=false by default so
> that things like
>
>STRING : '"' ('\\' '"'|.)* '"' ;
>
> so I don't have to say
>
>STRING : '"' (options {greedy=false;}:'\\' '"'|.)* '"' ;
Provided it only goes non-greedy if there's a following
character. Otherwise I think it'd lead to too much change in
behaviour. (And I'm not sure what makes sense if the following
character can be optional.)
But yeah, like Loring I almost never use a . in this sort of case;
it's more common to do something along the lines of:
STRING : '"' ('\\' ('"' | '\\' | 'r' | 'n') | ~('\\' | '"'))* '"';
Very explicit and I think it makes greediness or non-greediness
irrelevant too. Alternatively, if I don't want the lexer to choke
if an invalid escape sequence is used, I'll use the simpler form:
STRING : '"' ('\\' . | ~('\\' | '"'))* '"';
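As a rough illustration of why the explicit form makes greediness irrelevant, here is an analogue using Python's re module rather than ANTLR. This is only a sketch under that assumption: regex backtracking is not the same thing as ANTLR's lexer prediction, and every name below is hypothetical. With the negated set, the greedy and lazy loops have to stop at the same closing quote; with a wildcard they can differ.

```python
import re

# Regex analogue (assumption, not ANTLR) of:
#   STRING : '"' ('\\' . | ~('\\' | '"'))* '"';
GREEDY = re.compile(r'"(?:\\.|[^\\"])*"')
LAZY = re.compile(r'"(?:\\.|[^\\"])*?"')

# Regex analogue of the wildcard form: '"' ('\\' '"' | .)* '"'
DOT_GREEDY = re.compile(r'"(?:\\"|.)*"')
DOT_LAZY = re.compile(r'"(?:\\"|.)*?"')

# Two string literals on one line; the first contains an escaped quote.
text = r'"a\"b" + "c"'


def first_match(pattern, s):
    """Return the text of the first match of pattern in s, or None."""
    m = pattern.search(s)
    return m.group(0) if m else None


if __name__ == "__main__":
    for name, pat in [("explicit greedy", GREEDY), ("explicit lazy", LAZY),
                      ("wildcard greedy", DOT_GREEDY), ("wildcard lazy", DOT_LAZY)]:
        print(name, "->", first_match(pat, text))
```

In this sketch the two explicit patterns both stop at the end of the first literal, while the greedy wildcard pattern runs on to the last quote on the line.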
Admittedly this approach does get a bit messy when the termination
sequence consists of multiple characters (e.g. '*/' for a C-like
block comment or ']]>' for XML CDATA). That's when an
auto-non-greedy approach might be beneficial.
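To show how messy the explicit form gets with a two-character terminator, here is the same kind of analogue in Python's re module (again an assumption for illustration, not ANTLR syntax; the pattern names are hypothetical). The explicit pattern has to track runs of '*' by hand, while the lazy wildcard pattern simply stops at the first '*/'.

```python
import re

# Explicit form: no wildcard, so greediness does not matter, but the
# pattern must handle '*' characters that are not followed by '/'.
EXPLICIT_COMMENT = re.compile(r'/\*[^*]*\*+(?:[^/*][^*]*\*+)*/')

# Lazy wildcard form: stops at the first '*/'.
LAZY_COMMENT = re.compile(r'/\*.*?\*/', re.DOTALL)

# Greedy wildcard form: runs on to the last '*/' in the input.
GREEDY_COMMENT = re.compile(r'/\*.*\*/', re.DOTALL)

# The first comment ends at '**/'; a stray '*/' follows later.
source = '/* a * b **/ x */'


def first_comment(pattern, s):
    """Return the text of the first match of pattern in s, or None."""
    m = pattern.search(s)
    return m.group(0) if m else None


if __name__ == "__main__":
    print(first_comment(EXPLICIT_COMMENT, source))
    print(first_comment(LAZY_COMMENT, source))
    print(first_comment(GREEDY_COMMENT, source))
```

Here the explicit and lazy patterns agree on the first comment, and only the greedy wildcard overshoots, which is the case where a non-greedy default would save effort.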