[antlr-interest] Re: multibyte character sets

aristeinberg ari.steinberg at embarcadero-ca.com
Tue Oct 14 10:23:17 PDT 2003


It is a C++ app, and we are not compiling with Unicode.  If I compile 
using ASCII, will the parser still accept UTF-8?

Thanks,
Ari

--- In antlr-interest at yahoogroups.com, "micheal_jor" 
<open.zone at v...> wrote:
> --- In antlr-interest at yahoogroups.com, "aristeinberg"
> <ari.steinberg at e...> wrote:
> > Hi, 
> > 
> > Is there an option or any other way to get ANTLR to work with 
> > multibyte character sets?  I have a grammar that works fine 
> > except if you change the language ( on win 2000 ) from English 
> > to Hangul ( Korean ) and try to pass this into the parser, in 
> > which case it crashes the app.
> 
> You didn't mention the implementation language of your Lexer/Parser
> (ANTLR supports Java, C++ and C#). In any case the answer would be
> that ANTLR supports Unicode. 
> 
> I think that UTF-8 is the expected encoding so you may have to 
> ensure your input files are in UTF-8 or are converted into a 
> UTF-8 string buffer.
> 
> Cheers,
> 
> Micheal
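
For illustration, here is a minimal C++ sketch of what "a UTF-8 string 
buffer" means at the byte level in a non-Unicode build, where text is 
held in plain char strings. The helper names encode_utf8 and 
count_non_ascii are hypothetical and are not part of ANTLR. The sketch 
only shows that a Hangul character becomes several bytes above 0x7F 
that fit in an ordinary std::string. It says nothing about how an 
ANTLR-generated lexer treats those bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: encode one Unicode code point as UTF-8 bytes.
// Assumption: the caller passes a valid scalar value; surrogate code
// points (U+D800..U+DFFF) are not rejected by this sketch.
std::string encode_utf8(unsigned long cp) {
    assert(cp <= 0x10FFFFUL);
    std::string out;
    if (cp < 0x80UL) {
        // 7-bit ASCII is encoded as a single identical byte.
        out += static_cast<char>(cp);
    } else if (cp < 0x800UL) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000UL) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Hypothetical helper: count bytes outside 7-bit ASCII. The cast to
// unsigned char matters: where plain char is signed, bytes above 0x7F
// are negative, which is unsafe if a byte is used as a table index.
std::size_t count_non_ascii(const std::string& s) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (static_cast<unsigned char>(s[i]) > 0x7F) {
            ++n;
        }
    }
    return n;
}
```

With these helpers, encode_utf8(0xD55C) (the Hangul syllable U+D55C) 
yields the three bytes ED 95 9C, all of which count_non_ascii reports 
as non-ASCII, while plain English text passes through unchanged as 
single bytes.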


 




