[antlr-interest] UTF-8 input?

Jim Idle jimi at temporal-wave.com
Fri Jan 22 12:06:31 PST 2010


Do you not see the function call:

ConvertUTF8toUTF16() ?

In the file called 'antlr3convertutf.c'?
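
For readers who want to see roughly what such a conversion involves, here is a minimal standalone sketch in C. It is not the runtime's ConvertUTF8toUTF16() and does not reproduce its signature (check the runtime source for that); the function name utf8_to_ucs2 and its parameters are hypothetical, made up for illustration. It decodes UTF-8 into 16-bit UCS-2 units and, in line with the surrogate-pair caveat quoted below, rejects anything outside the Basic Multilingual Plane.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, NOT part of the ANTLR C runtime.
 * Decodes UTF-8 from src[0..src_len) into 16-bit UCS-2 units in dst.
 * Returns the number of units written, or -1 on malformed input,
 * on a code point above U+FFFF (which would need a surrogate pair),
 * or when dst is too small. */
long utf8_to_ucs2(const unsigned char *src, size_t src_len,
                  uint16_t *dst, size_t dst_cap)
{
    size_t i = 0, out = 0;

    assert(src != NULL && dst != NULL);

    while (i < src_len) {
        unsigned char b0 = src[i];
        uint32_t cp;
        size_t extra;   /* number of continuation bytes expected */

        if (b0 < 0x80)                { cp = b0;        extra = 0; }
        else if ((b0 & 0xE0) == 0xC0) { cp = b0 & 0x1F; extra = 1; }
        else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; extra = 2; }
        else return -1;  /* 4-byte lead (outside the BMP) or stray byte */

        if (extra > src_len - i - 1) return -1;   /* truncated sequence */

        for (size_t k = 1; k <= extra; k++) {
            unsigned char b = src[i + k];
            if ((b & 0xC0) != 0x80) return -1;    /* not a continuation byte */
            cp = (cp << 6) | (uint32_t)(b & 0x3F);
        }

        /* Reject overlong encodings and UTF-16 surrogate code points. */
        if ((extra == 1 && cp < 0x80) || (extra == 2 && cp < 0x800)) return -1;
        if (cp >= 0xD800 && cp <= 0xDFFF) return -1;

        if (out >= dst_cap) return -1;            /* output buffer full */
        dst[out++] = (uint16_t)cp;
        i += extra + 1;
    }
    return (long)out;
}
```

The resulting 16-bit buffer is the kind of data a parser built with ANTLR3_INLINE_INPUT_UTF16 expects. In real code the runtime's own converter is likely the better choice, since it is already shipped and tested.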

Jim



> -----Original Message-----
> From: Xie, Linlin [mailto:linlin.xie at siemens.com]
> Sent: Friday, January 22, 2010 4:58 AM
> To: Jim Idle; antlr-interest at antlr.org
> Subject: RE: [antlr-interest] UTF-8 input?
> 
> Hi Jim,
> 
> Thanks for the reply. You said I can convert my UTF8 input "to UCS2
> using the supplied converter in the current runtime", but I can't find
> any such converter in the ANTLR C runtime. Can you suggest which API to
> use? By the way, I searched the archive and can see that someone with a
> similar problem used the iconv library on Linux.
> 
> Thanks in advance!
> Linlin
> 
> 
> -----Original Message-----
> From: antlr-interest-bounces at antlr.org
> [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Jim Idle
> Sent: 20 January 2010 16:31
> To: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] UTF-8 input?
> 
> You need to remember to state which target you are talking about.
> 
> I have written a new universal input stream for the next version of the
> C runtime. It takes 8-bit, 16-bit, UTF-8, UTF-16, UCS2, UTF32 and EBCDIC
> (code gen will change slightly to support this). It is not well tested
> right now but will be available as a snapshot 3.3 release shortly on the
> downloads page.
> 
> In the meantime the easiest thing to do is to convert to UCS2 using the
> supplied converter in the current runtime. Note that this will not work
> with surrogate pairs in UTF-16, but most people do not need that.
> 
> If you really need UTF-8 without conversion then it is easy enough to
> write, or you can just steal the code from my check-in, which should
> land in about 10 minutes. Note that while the streams work, I have not
> provided ANTLR3_STRING support for UTF-8 and so on yet, so getting $text
> from such a stream may or may not work.
> 
> Jim
> 
> > -----Original Message-----
> > From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> > bounces at antlr.org] On Behalf Of Xie, Linlin
> > Sent: Wednesday, January 20, 2010 3:32 AM
> > To: antlr-interest at antlr.org
> > Subject: [antlr-interest] UTF-8 input?
> >
> > Can anyone tell me if a parser generated by ANTLR 3.1.3 works with
> > UTF-8 input? If it does, how should I configure it in the grammar? I
> > noticed there are two macros, ANTLR3_INLINE_INPUT_ASCII and
> > ANTLR3_INLINE_INPUT_UTF16, but no UTF-8 one.
> >
> >
> >
> > Many thanks!
> >
> > Linlin
> >
> >
> > List: http://www.antlr.org/mailman/listinfo/antlr-interest
> > Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
> 
> 
> 
> 




