[antlr-interest] v4 "Honey Badger" teaser

Thu Dec 29 18:27:10 PST 2011

A less noisy way to express case insensitivity would be a great benefit, but I too was thinking along the same lines as Sam:

At 12/29/2011 06:07 PM, Sam Barnett-Cormack wrote:
>Assuming unicode featureset, a proper semantic case insensitivity would 
>be lovely - so the unicode properties were used to determine whether 
>there was a case-insensitive match. Someone might have a use for other 
>unicode matching, though, like base-glyph matching (ignoring diacritics).

... which led me to think that a more flexible way to say "apply case insensitivity to this string" is needed, that could invoke either:

a) a built-in transformation, such as standard ASCII case insensitivity: CI("AB") --> [Aa][Bb], and possibly other built-in standards for a range of Unicode character sets.

b) a user-supplied plug-in: CI("AB", MyTrans) --> whatever MyTrans returns.

c) or, with syntax similar to (b) and to avoid depending on the target code language, something specified elsewhere in the grammar file using regex or whatever.
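
To make (a) and (b) concrete, here is a rough sketch in Python of what such an expansion could do. None of this is ANTLR API: the names expand, ascii_ci and my_trans are made up for illustration, the bracketed output is just the notation used above, and the escaping rule for set metacharacters is an assumption.

```python
from typing import Callable

# Hypothetical plug-in shape: given one character of the literal,
# return every character that should be accepted in its place.
Transform = Callable[[str], str]


def ascii_ci(ch: str) -> str:
    """Option (a): plain ASCII case insensitivity."""
    if ch.isascii() and ch.isalpha():
        return ch.upper() + ch.lower()
    return ch


def my_trans(ch: str) -> str:
    """Option (b): a user-supplied transformation (illustrative only).

    Adds a few accented forms on top of ASCII case folding.
    """
    accents = {"E": "ÉÈ", "e": "éè"}
    base = ascii_ci(ch)
    return base + "".join(accents.get(c, "") for c in base)


def _escape(ch: str) -> str:
    # Assumed escaping for characters that are special inside [...].
    return "\\" + ch if ch in "\\]-" else ch


def expand(literal: str, transform: Transform = ascii_ci) -> str:
    """Spell out a literal as one character set per input character."""
    return "".join(
        "[" + "".join(_escape(c) for c in transform(ch)) + "]"
        for ch in literal
    )


print(expand("AB"))            # built-in ASCII folding
print(expand("be", my_trans))  # user-supplied transformation
```

The point of the sketch is only that the transformation is a parameter: ASCII folding is one value of it, not something baked into the expansion itself.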

I'm not particularly advocating the above syntax, just the general idea of facilitating shorthands for generating the fully-spelled-out series of character sets, and also advocating trying to avoid special-casing one particular variety of case-insensitivity within ANTLR syntax. 

Hmmm, this is sliding perilously close to an ANTLR preprocessor.  :-)

-- Graham