[antlr-interest] Unicode character classes

Christian chwchw at gmx.de
Thu Dec 15 13:14:19 PST 2011


Thanks for your solution. However, some Unicode character classes cannot
easily be expressed with the range operator because they have "holes",
i.e. their code points are not contiguous. Furthermore, I would have to
explicitly define a rule for each class I want to use. So, again: does
ANTLR provide built-in support for Unicode character classes?
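
To illustrate the "holes" problem, here is a minimal sketch (not an ANTLR
feature) of how the ranges for one class could be generated instead of
written by hand. It assumes Java's Character.getType as the source of the
Unicode general category, restricts itself to the BMP (U+0000 to U+FFFF),
and emits a fragment rule built only from char literals and the range
operator. The class name UnicodeClassRanges and the rule name UNICODE_LU
are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: turns one Unicode general category into an ANTLR
// lexer fragment rule made of char literals and '..' ranges.
class UnicodeClassRanges {

    // Collects maximal contiguous BMP ranges {start, end} whose general
    // category (as reported by Character.getType) equals `category`.
    static List<int[]> rangesFor(int category) {
        List<int[]> ranges = new ArrayList<>();
        int start = -1; // -1 means "not currently inside a range"
        for (int cp = 0; cp <= 0xFFFF; cp++) {
            boolean inClass = Character.getType(cp) == category;
            if (inClass && start < 0) {
                start = cp;                            // a new range opens
            } else if (!inClass && start >= 0) {
                ranges.add(new int[] {start, cp - 1}); // a "hole" closes it
                start = -1;
            }
        }
        if (start >= 0) {
            ranges.add(new int[] {start, 0xFFFF});
        }
        return ranges;
    }

    // Formats the ranges as alternatives of a single fragment rule.
    static String toFragmentRule(String ruleName, List<int[]> ranges) {
        StringBuilder sb = new StringBuilder("fragment " + ruleName + "\n    : ");
        for (int i = 0; i < ranges.size(); i++) {
            int[] r = ranges.get(i);
            if (i > 0) {
                sb.append("\n    | ");
            }
            sb.append(String.format("'\\u%04X'", r[0]));
            if (r[1] > r[0]) {
                sb.append(String.format("..'\\u%04X'", r[1]));
            }
        }
        return sb.append("\n    ;\n").toString();
    }

    public static void main(String[] args) {
        // Category Lu (uppercase letters); any other Character constant works.
        System.out.print(toFragmentRule("UNICODE_LU",
                rangesFor(Character.UPPERCASE_LETTER)));
    }
}
```

The printed rule could be pasted into a lexer grammar and referenced from
other rules. The exact ranges depend on the Unicode version shipped with
the JRE, and a separate rule is still needed per class, so this only
automates the tedious part.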

Regards,
Christian

On 15.12.2011 21:32, Bart Kiers wrote:
> Hi Christian,
>
> Sure. 
> For example, the following rule: 
>
>     LatinExtB_first4 : '\u0180'..'\u0183';
>
> will match any of the first 4 Latin Extended-B* characters
> (U+0180 through U+0183; the range is inclusive at both ends).
>
> Regards,
>
> Bart.
>
>
> * http://en.wikipedia.org/wiki/List_of_Unicode_characters#Latin_Extended-B
>
>
> On Thu, Dec 15, 2011 at 9:23 PM, Christian <chwchw at gmx.de> wrote:
>
>     Hi community,
>
>     I've read a couple of threads, but none of the questions about
>     whether ANTLR supports Unicode character classes were answered.
>     Therefore, I am asking it now:
>
>     Does ANTLR support Unicode character classes? And if not, how can I
>     easily put them into a lexer grammar anyway?
>
>     Regards,
>     Christian
>
>


