[antlr-interest] proposal: make .* and .+ always nongreedy

Gary R. Van Sickle g.r.vansickle at att.net
Sun Mar 22 23:49:44 PDT 2009


> From:  George S. Cowan
>
> Certainly make '.*' and '.+' consistent across all the 
> grammar types, one way or the other.
>

I see this as the paramount consideration.  There is nothing more irritating
and defect-causing in a language than things that are "almost consistent".
 
> 
> I suggest deprecating the current notation and using:
>   (...)** and (...)++ for greedy
>   (...)|* and (...)|+ for nongreedy 
>     (or some other notation that indicates "stop me, quick")
> 

I second Ter's distaste for introducing new syntax.  Perl regexes have a
well-established way of doing this:

.*  = greedy
.*? = non-greedy.
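
For anyone who hasn't run into the distinction, here is a minimal sketch
using Python's re module, which follows the Perl convention for these two
quantifiers.  The sample string and variable names are made up purely for
illustration, and this is Python regex behavior, not ANTLR:

```python
import re

# Hypothetical input: two C-style comments on one line.
text = "/* first */ x = 1; /* second */"

# Greedy: .* grabs as much as it can, so the match runs to the LAST "*/".
greedy = re.search(r"/\*.*\*/", text).group(0)

# Non-greedy: .*? stops at the FIRST "*/" that lets the match succeed.
nongreedy = re.search(r"/\*.*?\*/", text).group(0)

print(greedy)     # the whole line, both comments included
print(nongreedy)  # just "/* first */"
```

The only difference between the two patterns is the trailing "?", which is
the whole appeal of the notation.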

I can't think of a defensible argument for not simply adopting this
established syntax.  I don't follow the argument for changing the default to
be non-greedy (in fact I don't see the rationale stated anywhere).  Adding
.*? leaves the existing syntax and semantics alone, so nobody's existing
parsers get broken, and by adopting the same syntax/semantics as Perl
regexes, we creep ever closer to the day when we can all use these glorious
regexes in our ANTLR lexers.  It's win/win/win.

-- 
Gary R. Van Sickle
 


