>> A few things that would be interesting to add:
>> Allow you to reference sets like JAVA_IDENTIFIER or LATIN_... and then
>> characters like 'GREATER-THAN SIGN' and 'APOSTROPHE-QUOTE'.  The latter
> That would be cool. For the OpenJMS selector grammar I currently have
> protected rules corresponding to the Character.isJavaIdentifierStart()
> and Character.isJavaIdentifierPart() methods - being able to replace
> these long (and non-obvious) rules would be great.

Yep, the obvious ones for ID and such would be easy, but would it be
worth it just to do those?

>> would be easy: just a hashtable lookup if I can find the Unicode char
>> index in Java somewhere ;)  The former is harder as there is nothing
>> in Java's Character.java class that lets me get a set of chars for say
>> GREEK_EXTENDED.  Anybody know a good library that would give me a set
>> of chars from these char class names?  I've just found:
> You could derive them using Character.UnicodeBlock and a bit of brute
> force, i.e., iterate through all possible chars, invoking
> UnicodeBlock.of(char), and populate a set corresponding to the
> returned UnicodeBlock.

Again, though, we'd have to do it for every char class... I guess that
ain't bad ;)
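
For what it's worth, here is a rough sketch of how both lookups might
look. It is only an illustration: the class and method names
(CharClassTables, blockSets, nameTable) are made up, it only covers the
16-bit char range, and it assumes a JDK with Character.getName (Java 7+)
and Map.computeIfAbsent (Java 8+). One pass over all chars fills a set
per UnicodeBlock, so "doing it for every char class" costs a single loop.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; the class and method names are made up for illustration.
class CharClassTables {

    // One pass over the 16-bit char range, bucketing each char by its block.
    static Map<Character.UnicodeBlock, BitSet> blockSets() {
        Map<Character.UnicodeBlock, BitSet> sets = new HashMap<>();
        for (int c = Character.MIN_VALUE; c <= Character.MAX_VALUE; c++) {
            Character.UnicodeBlock block = Character.UnicodeBlock.of((char) c);
            if (block != null) {  // of() returns null for chars in no block
                sets.computeIfAbsent(block, b -> new BitSet()).set(c);
            }
        }
        return sets;
    }

    // Name -> char table built from Character.getName.
    static Map<String, Character> nameTable() {
        Map<String, Character> names = new HashMap<>();
        for (int c = Character.MIN_VALUE; c <= Character.MAX_VALUE; c++) {
            String name = Character.getName(c);
            if (name != null) {  // null means the code point is unassigned
                names.put(name, (char) c);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        BitSet greek = blockSets().get(Character.UnicodeBlock.GREEK_EXTENDED);
        System.out.println("GREEK_EXTENDED has " + greek.cardinality() + " chars");
        System.out.println("GREATER-THAN SIGN is "
                + nameTable().get("GREATER-THAN SIGN"));
    }
}
```

One caveat on the names: as far as I know, 'APOSTROPHE-QUOTE' is the
older Unicode 1.0 name, and Character.getName reports U+0027 as
"APOSTROPHE", so a table built this way would need aliases for the old
names.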

