[antlr-interest] Using JFlex with Antlr

Johannes Luber jaluber at gmx.de
Wed Oct 3 15:41:34 PDT 2007


Andreas Ravnestad wrote:
> Hi Johannes,
> 
> I ran into problems with \u10000 and above. Perhaps this has been solved
> now? I am sorry for my unfortunate phrasing, I rather meant that the
> support is lacking, in some ways :)
> 
> -Andreas
> 

That's primarily a problem with Unicode handling in Java itself, and
only secondarily with ANTLR. Java uses UTF-16 internally, and until 1.5
it couldn't even handle the supplementary characters (code points above
U+FFFF). AFAIK, Java still has no equivalent to the C# construct
'\U00010000', so you have to create the surrogate pair representation
yourself (this is described in the FAQ on the Unicode website). I also
regret that ANTLR doesn't let users circumvent this step for now; that
would be really useful.
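
For illustration, here is a minimal Java sketch of building the surrogate
pair by hand, using the arithmetic from the Unicode FAQ, and comparing it
with Character.toChars (available since Java 1.5). The class and method
names (SurrogatePairs, manualSurrogates, toJavaEscapes) are made up for
this example and are not part of ANTLR or JFlex.

```java
class SurrogatePairs {
    // Hypothetical helper: computes the UTF-16 surrogate pair by hand.
    // Only valid for supplementary code points (U+10000 to U+10FFFF).
    public static char[] manualSurrogates(int codePoint) {
        if (codePoint < 0x10000 || codePoint > 0x10FFFF) {
            throw new IllegalArgumentException("not a supplementary code point");
        }
        int offset = codePoint - 0x10000;                // 20-bit value
        char high = (char) (0xD800 + (offset >> 10));    // top 10 bits
        char low  = (char) (0xDC00 + (offset & 0x3FF));  // bottom 10 bits
        return new char[] { high, low };
    }

    // Hypothetical helper: renders a code point as the \uXXXX escapes
    // you would have to type into Java source or a grammar file.
    public static String toJavaEscapes(int codePoint) {
        StringBuilder sb = new StringBuilder();
        // Character.toChars yields one char for BMP code points, two otherwise.
        for (char unit : Character.toChars(codePoint)) {
            sb.append(String.format("\\u%04X", (int) unit));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        char[] pair = manualSurrogates(0x10000);
        // The hand-computed pair should match what the JDK produces.
        System.out.println(java.util.Arrays.equals(pair, Character.toChars(0x10000)));
        System.out.println(toJavaEscapes(0x10000));
    }
}
```

So where C# lets you write '\U00010000', in Java you would write the two
escapes produced by toJavaEscapes(0x10000) one after the other.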

Best regards,
Johannes Luber

