[antlr-interest] Re: multibyte character sets

micheal_jor open.zone at virgin.net
Tue Oct 14 09:08:13 PDT 2003


--- In antlr-interest at yahoogroups.com, "aristeinberg"
<ari.steinberg at e...> wrote:
> Hi, 
> 
> Is there an option or any other way to get ANTLR to work with 
> multibyte character sets?  I have a grammar that works fine except 
> if you change the language ( on win 2000 ) from English to Hangul ( 
> Korean ) and try to pass this into the parser, in which case it 
> crashes the app.

You didn't mention the implementation language of your lexer/parser
(ANTLR supports Java, C++ and C#). In any case, the answer is that
ANTLR supports Unicode.

I think that UTF-8 is the expected encoding, so you may have to ensure
that your input files are in UTF-8 or are converted into a UTF-8 string
buffer before they reach the lexer.
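
As a rough illustration of the conversion step, here is a minimal Java
sketch that uses only standard JDK classes. It assumes Java is the
implementation language, which the original question did not say. The
class name Utf8Transcode and the helper toUtf8 are made up for this
example and are not part of ANTLR. The source charset is passed in by
the caller; for Korean text on Windows it would likely be a legacy
code page, but that is a guess about the poster's setup.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Utf8Transcode {
    // Hypothetical helper (not an ANTLR API): decode the raw bytes using
    // the charset they were written in, then re-encode them as UTF-8.
    public static byte[] toUtf8(byte[] input, Charset source) {
        String text = new String(input, source);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Two Hangul syllables, written as escapes so the source file
        // itself stays ASCII.
        String hangul = "\uD55C\uAE00";

        // UTF-16BE stands in for the real input encoding here because it
        // is guaranteed to exist on every JVM. A legacy Korean charset
        // obtained via Charset.forName(...) would be used the same way,
        // if the JRE provides it.
        byte[] original = hangul.getBytes(StandardCharsets.UTF_16BE);
        byte[] utf8 = toUtf8(original, StandardCharsets.UTF_16BE);

        System.out.println("UTF-8 byte count: " + utf8.length);
    }
}
```

The resulting byte array (or the intermediate String) is what you would
then hand to the lexer. How to do that depends on the ANTLR target and
version, so it is left out here.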

Cheers,

Micheal



 
