[antlr-interest] Unicode character classes

Bart Kiers bkiers at gmail.com
Thu Dec 15 13:38:11 PST 2011


Ah, I thought you meant the regex type of "character class", but you meant
something like `\p{CLASS_NAME}`...
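
For classes with "holes", one workaround is to generate the ranges instead
of typing them by hand. Below is a minimal sketch, assuming the Java target
and ANTLR 3 style '\uXXXX' character literals. The class name
UnicodeClassRule, its methods, and the rule name UNICODE_LU are all
hypothetical; nothing here is an ANTLR API. It relies only on the JDK's
Character.getType() to find every contiguous run of code points in a given
general category and prints them as alternatives of a fragment rule.

```java
import java.util.ArrayList;
import java.util.List;

public class UnicodeClassRule {

    /**
     * Collects the contiguous BMP ranges whose general category, as reported
     * by Character.getType(), equals the requested one. Each entry is
     * {first, last}, both inclusive.
     */
    public static List<int[]> ranges(int category) {
        List<int[]> out = new ArrayList<>();
        int start = -1; // -1 means "not currently inside a run"
        for (int cp = 0; cp <= 0xFFFF; cp++) {
            boolean in = Character.getType(cp) == category;
            if (in && start < 0) {
                start = cp;                          // a run begins
            } else if (!in && start >= 0) {
                out.add(new int[] {start, cp - 1});  // a run ends at a "hole"
                start = -1;
            }
        }
        if (start >= 0) {
            out.add(new int[] {start, 0xFFFF});      // run reaches the end of the BMP
        }
        return out;
    }

    /** Renders the ranges as one lexer fragment rule, one alternative per range. */
    public static String toRule(String name, List<int[]> ranges) {
        StringBuilder sb = new StringBuilder("fragment " + name + "\n");
        String sep = "    : ";
        for (int[] r : ranges) {
            sb.append(sep).append(String.format("'\\u%04X'", r[0]));
            if (r[1] != r[0]) {
                sb.append(String.format("..'\\u%04X'", r[1]));
            }
            sb.append('\n');
            sep = "    | ";
        }
        return sb.append("    ;\n").toString();
    }

    public static void main(String[] args) {
        // Example: the "Lu" (uppercase letter) category.
        System.out.print(toRule("UNICODE_LU", ranges(Character.UPPERCASE_LETTER)));
    }
}
```

The printed rule can be pasted into a lexer grammar and referenced from other
lexer rules. The sketch covers only the Basic Multilingual Plane, and the
resulting ranges depend on the Unicode version shipped with the JDK that runs
it.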

Bart.


On Thu, Dec 15, 2011 at 10:14 PM, Christian <chwchw at gmx.de> wrote:

> Thanks for your solution. However, some Unicode character classes cannot
> easily be expressed with the range operator because they have "holes".
> Furthermore, I would have to define a rule explicitly for each class I
> want to use. So, again: does ANTLR provide Unicode character class support?
>
> Regards,
> Christian
>
> On 15.12.2011 21:32, Bart Kiers wrote:
> > Hi Christian,
> >
> > Sure.
> > For example, the following rule:
> >
> >     LatinExtB_first4 : '\u0180'..'\u0183';
> >
> > will match any of the first 4 Latin Extended-B* characters (the range
> > is inclusive at both ends).
> >
> > Regards,
> >
> > Bart.
> >
> >
> > * http://en.wikipedia.org/wiki/List_of_Unicode_characters#Latin_Extended-B
> >
> >
> > On Thu, Dec 15, 2011 at 9:23 PM, Christian <chwchw at gmx.de> wrote:
> >
> >     Hi community,
> >
> >     I've read a couple of threads, but none of the questions about
> >     whether ANTLR supports Unicode character classes were answered.
> >     Therefore, I'm asking it again:
> >
> >     Does ANTLR support Unicode character classes? And if not, how can I
> >     easily put them into a lexer grammar anyway?
> >
> >     Regards,
> >     Christian
> >
> >     List: http://www.antlr.org/mailman/listinfo/antlr-interest
> >     Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
> >
> >
>

