[antlr-interest] Re: multibyte character sets

Ric Klaren klaren at cs.utwente.nl
Tue Oct 14 15:11:39 PDT 2003


On Tue, Oct 14, 2003 at 05:23:17PM -0000, aristeinberg wrote:
> It is a C++ app and we are not compiling in unicode.  If I compile 
> using ASCII, will the parser still accept UTF-8?

At the moment it is not possible to have a full-featured parser in C++ with
Unicode support, unless you want to do a lot of hacking and debugging. See
the list archive for the many previous discussions.

The support lib needs a rewrite before that will work. UTF-8 will probably
work very badly. You need to modify the support lib to use wchar_t and
wstring. It's been done, but you may run into trouble in places. The
analysis engine supports it; the support lib does not.
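
To illustrate why a char-based lexer struggles with UTF-8: one character may
span several bytes, so code that treats each char as one character sees the
pieces rather than the whole. The following is only a minimal sketch of
decoding UTF-8 into one element per code point. The helper name decode_utf8
is hypothetical and is not part of the ANTLR support lib. It also assumes a
C++11 compiler and uses char32_t/std::u32string as a stand-in for the
wchar_t/wstring approach, since the width of wchar_t varies by platform.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not an ANTLR API): decode a UTF-8 byte string into
// one char32_t per code point. It checks lead and continuation bytes only;
// overlong encodings and surrogate values are not rejected in this sketch.
std::u32string decode_utf8(const std::string& bytes) {
    std::u32string out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        char32_t cp = 0;
        std::size_t extra = 0;  // number of continuation bytes expected

        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::runtime_error("invalid UTF-8 lead byte");

        if (i + extra >= bytes.size())
            throw std::runtime_error("truncated UTF-8 sequence");

        // Each continuation byte has the form 10xxxxxx and adds 6 bits.
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::runtime_error("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }

        out.push_back(cp);
        i += extra + 1;
    }
    return out;
}
```

For example, the string "a\xC3\xA9\xE2\x82\xAC" (a, e-acute, euro sign) holds
six chars, but decode_utf8 returns three code points. A lexer that matches on
individual chars sees six input symbols there, which is the mismatch a
wide-character support lib would avoid.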

Cheers,

Ric
-- 
-----+++++*****************************************************+++++++++-------
    ---- Ric Klaren ----- j.klaren at utwente.nl ----- +31 53 4893722  ----
-----+++++*****************************************************+++++++++-------
   Words fly like arrows
      as if we knew what was right and wrong. --- Chuang Tsu


 
