[antlr-interest] v4 "Honey Badger" teaser

Terence Parr parrt at cs.usfca.edu
Sat Dec 31 11:10:30 PST 2011


I'm happy to think about this after the first release when I have more time.
Ter
On Dec 31, 2011, at 7:46 AM, Sam Barnett-Cormack wrote:

> I'm not sure it need be complexity so much as what many language guides I've read refer to as "syntactic sugar". This is exactly the kind of situation that syntactic sugar makes a language much more usable.
> 
> However, I will agree it's not that common these days. I still think a basic bit of syntactic sugar would be worthwhile.
> 
> By syntactic sugar, I mean something that could be achieved with [..], but with a shortcut provided for common cases, such as case insensitivity. It's worth noting that [..] syntax has uses other than case insensitivity; just look at how it gets used in regex. As an aside, isn't it weird that modern regexes are still called regexes, when they really aren't regular in the formal-grammar sense any more?
> 
> Sam
> 
> On 31/12/2011 00:44, Terence Parr wrote:
>> Hi Graham and crew… Fortunately, case-insensitive keywords are less common these days. Not sure it's worth adding some complexity to deal with it when the […] thing is okay.
>> Ter
>> On Dec 29, 2011, at 6:27 PM, Graham Wideman wrote:
>> 
>>> A way to deal with case-insensitivity that is less noisy to read would be a great benefit, but I too was thinking along the lines of Sam:
>>> 
>>> At 12/29/2011 06:07 PM, Sam Barnett-Cormack wrote:
>>>> Assuming a Unicode feature set, a proper semantic case insensitivity would
>>>> be lovely, so that the Unicode properties were used to determine whether
>>>> there was a case-insensitive match. Someone might have a use for other
>>>> Unicode matching, though, like base-glyph matching (ignoring diacritics).
>>> 
>>> ... which led me to think that a more flexible way to say "apply case insensitivity to this string" is needed, that could invoke either:
>>> 
>>> a) one or another built-in transformation, such as standard ASCII case insensitivity:  CI("AB") -->  [Aa][Bb], and possibly other built-in standards for a range of unicode character sets.
>>> 
>>> b) or invokes a user-supplied plug-in: CI("AB", MyTrans) -->  whatever MyTrans returns.
>>> 
>>> c) or, with syntax similar to (b), and to avoid code-language-dependency, invokes something specified elsewhere in the grammar file using regex or whatever.
>>> 
>>> I'm not particularly advocating the above syntax, just the general idea of facilitating shorthands for generating the fully-spelled-out series of character sets, and also advocating trying to avoid special-casing one particular variety of case-insensitivity within ANTLR syntax.
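
Options (a) and (b) above can be sketched outside ANTLR as a small string-to-character-set expander. This is only an illustration of the idea, not anything ANTLR provides: the names ci and ascii_case are hypothetical, Python is used purely for convenience, and the escaping rule for characters inside [...] is an assumption.

```python
from typing import Callable


def _escape(ch: str) -> str:
    # Assumed escaping rule: backslash-escape characters that are
    # special inside a [...] set.
    return "\\" + ch if ch in "]\\-" else ch


def ascii_case(ch: str) -> str:
    """Built-in transform for option (a): both cases of an ASCII letter,
    or the character unchanged if it is not an ASCII letter."""
    if ch.isascii() and ch.isalpha():
        return ch.upper() + ch.lower()
    return ch


def ci(text: str, transform: Callable[[str], str] = ascii_case) -> str:
    """Expand text into a series of [...] sets, one per input character.

    transform maps one character to the string of alternatives it should
    match; passing a different callable corresponds to option (b).
    """
    return "".join(
        "[" + "".join(_escape(c) for c in transform(ch)) + "]" for ch in text
    )


print(ci("AB"))  # prints [Aa][Bb]
```

A user-supplied transform in the spirit of MyTrans is then just another callable, for example one that returns a base letter together with its accented variants.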
>>> 
>>> Hmmm, this is sliding perilously close to an ANTLR preprocessor.  :-)
>>> 
>>> -- Graham
>>> 
>>> 
>>> List: http://www.antlr.org/mailman/listinfo/antlr-interest
>>> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
>> 
>> 


