[antlr-interest] Unicode Character Class Support

chris king kingces95 at gmail.com
Thu Dec 30 11:56:41 PST 2010


Hello, does ANTLR have any support for matching Unicode character classes?

http://msdn.microsoft.com/en-us/library/20bw873z.aspx


The Unicode standard assigns each character a general category. For example,
a particular character can be an uppercase letter (the Lu category), a
decimal digit (the Nd category), a math symbol (the Sm category), or a
paragraph separator (the Zp category). Specific character sets in the
Unicode standard also occupy a specific range or block of consecutive code
points. For example, the basic Latin character set is found from \u0000
through \u007F, while the Arabic character set is found from \u0600 through
\u06FF.
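
To make the two notions concrete, here is a minimal sketch in Python (not ANTLR) that looks up a character's general category and tests block membership. It assumes the standard-library unicodedata module. The BLOCKS table and the helper names general_category and block_of are hypothetical, introduced only for illustration, and the table lists just the two ranges mentioned above.

```python
import unicodedata

# Hypothetical block table for illustration only; it holds just the two
# code point ranges quoted in the text, not the full Unicode block list.
BLOCKS = {
    "Basic Latin": (0x0000, 0x007F),
    "Arabic": (0x0600, 0x06FF),
}


def general_category(ch):
    """Return the two-letter Unicode general category of one character."""
    return unicodedata.category(ch)


def block_of(ch):
    """Return the name of the block in BLOCKS containing ch, or None."""
    cp = ord(ch)
    for name, (lo, hi) in BLOCKS.items():
        if lo <= cp <= hi:
            return name
    return None


if __name__ == "__main__":
    # U+2028 is the line separator (Zl); U+2029 is the paragraph separator (Zp).
    for ch in ["A", "7", "+", "\u2028", "\u2029", "\u0627"]:
        print(f"U+{ord(ch):04X}", general_category(ch), block_of(ch))
```

A lexer without built-in category support would need the equivalent of these lookups expanded into explicit code point ranges, which is what the question is hoping to avoid.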


Thanks,
Chris

