[antlr-interest] Re: Unicode support

Fri May 21 06:44:50 PDT 2004

Hello,

Yes, my only requirement is to parse "<asian characters here>".
The language I parse has strings which can contain Unicode 
characters, but the language itself doesn't need to be in Unicode.

I would be happy to have any suggestions on how to do this 
(explanations or examples).

Thanks in advance,

Best regards,

J.Claude.

--- In antlr-interest at yahoogroups.com, Mark Lentczner <markl at g...> 
wrote:
> 
> > I have to generate in C++ and I will need it to parse strings with
> > Asian languages. So I guess I need some pretty efficient Unicode
> > support.
> I'm not clear here.  Is your requirement that you have to parse 
> constructs like:
> 	"<asian characters here>"
> where the only non-US-ASCII characters appear between quotes?  And 
> that the only restriction between those quotes is that it is a 
> sequence of valid Unicode characters?  If so, this is easily doable 
> with Antlr in C++, if you treat your input as UTF-8.
> 
> If you need to support identifiers composed of non-US-ASCII 
> characters, it is a bit more difficult, but still doable.
> 
> This is exactly what I'm doing: My language is defined over the 
> full Unicode character set, allows Unicode in string literals, 
> comments, identifier names, and in a few cases operators (such as 
> U+00F7, the division sign).  I lex and parse the language with 
> Antlr, generating a C++ lexer that accepts a UTF-8 encoded Unicode 
> stream of bytes.
> 
> I'd be happy to share my work on this.
> 
> > Hope 3.0 will be out before end of summer because that's my 
> > deadline.
> I think the time frame is longer than that.
> 
> 	- Mark
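
To make the UTF-8 suggestion above concrete, here is a minimal sketch 
in plain C++ of why a byte-oriented scanner can handle Unicode text 
between quotes: every byte of a multi-byte UTF-8 sequence is 0x80 or 
higher, so the ASCII quote byte can never occur inside an encoded 
character. The function name scanQuotedUtf8 is hypothetical and is 
not part of Antlr or of Mark's lexer. The sketch assumes the string 
literal has no escape sequences, and it checks only the structure of 
each sequence (lead byte plus continuation bytes).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not an Antlr API. Scans a double-quoted string
// literal in a UTF-8 encoded buffer. `start` must index the opening quote.
// Returns the index just past the closing quote, or std::string::npos if
// the literal is unterminated or a sequence is structurally malformed.
// Assumption: no escape sequences. This is a structural check only; it
// does not reject every overlong form or encoded surrogate.
std::size_t scanQuotedUtf8(const std::string& input, std::size_t start) {
    assert(start < input.size() && input[start] == '"');
    std::size_t i = start + 1;
    while (i < input.size()) {
        const unsigned char b = static_cast<unsigned char>(input[i]);
        if (b == '"') {
            return i + 1;  // closing quote: always a single ASCII byte
        }
        // Work out the sequence length from the lead byte.
        std::size_t len = 0;
        if (b < 0x80) {
            len = 1;                          // US-ASCII
        } else if (b >= 0xC2 && b <= 0xDF) {
            len = 2;
        } else if (b >= 0xE0 && b <= 0xEF) {
            len = 3;                          // most CJK characters
        } else if (b >= 0xF0 && b <= 0xF4) {
            len = 4;
        } else {
            return std::string::npos;         // stray continuation or bad lead
        }
        if (i + len > input.size()) {
            return std::string::npos;         // truncated sequence
        }
        // Every continuation byte must have the bit pattern 10xxxxxx.
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(input[i + k]);
            if ((c & 0xC0) != 0x80) {
                return std::string::npos;
            }
        }
        i += len;
    }
    return std::string::npos;                 // no closing quote found
}
```

A lexer rule for string literals can follow the same idea: accept any 
byte other than the quote between the delimiters, and leave decoding 
of the contents to later stages if the characters themselves are 
never inspected.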
