[antlr-interest] Unicode XID_Start/XID_Continue? (And, other Unicode properties)
Gavin Lambert
antlr at mirality.co.nz
Sat Jul 5 15:59:48 PDT 2008
At 10:47 6/07/2008, Joe wrote:
>So they are unsupported. And apparently UTF-16 isn't even really
>supported. Shouldn't this stuff be fairly easy to implement? The
>java version of LA already returns an int, so why not add UTF-16
>decoding to it? And properties could be implemented via ICU
While you cannot type Unicode characters directly into an ANTLR
grammar (because grammars are parsed by ANTLR 2, which doesn't
understand Unicode), ANTLR 3 itself handles Unicode just fine. So
although you need to write Unicode escapes (such as '\u00E9') in
your grammar, you shouldn't have problems defining rules for any
characters you want to recognise in your own lexers.
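
As a rough sketch of what that looks like, here is a small ANTLR 3
lexer grammar that uses escapes for non-ASCII identifier characters.
The grammar name, rule names and the ranges chosen are my own
illustration, not anything from the ANTLR distribution. The ranges
are only a small sample of letters, nowhere near the full XID_Start
and XID_Continue sets, and nothing above U+FFFF is covered.

```
lexer grammar UnicodeIdent;   // hypothetical grammar name

// Illustrative subset only: NOT the real XID_Start / XID_Continue sets.
ID  : ID_START ID_PART* ;

fragment ID_START
    : 'a'..'z'
    | 'A'..'Z'
    | '_'
    | '\u00C0'..'\u00D6'   // Latin-1 letters (skips U+00D7, multiplication sign)
    | '\u00D8'..'\u00F6'   // Latin-1 letters (skips U+00F7, division sign)
    | '\u00F8'..'\u00FF'   // remaining Latin-1 letters
    | '\u0391'..'\u03A1'   // Greek capitals Alpha..Rho
    | '\u03A3'..'\u03A9'   // Greek capitals Sigma..Omega
    | '\u03B1'..'\u03C9'   // Greek small alpha..omega
    | '\u0410'..'\u044F'   // basic Cyrillic letters
    ;

fragment ID_PART
    : ID_START
    | '0'..'9'
    ;
```

To get closer to the real properties you would have to generate the
full list of ranges from the Unicode data files and paste them in as
escapes in the same way.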