[antlr-interest] Re: Unicode support
meilland78
jean-claude.meilland at experian-scorex.com
Fri May 21 06:44:50 PDT 2004
Hello,
Yes, my only requirement is to parse "<asian characters here>".
The language I parse has strings which can contain Unicode
characters, but the language itself doesn't need to be in Unicode.
I would be happy to have any suggestions on how to do this
(explanations or examples).
Thanks in advance,
Best regards,
J.Claude.
--- In antlr-interest at yahoogroups.com, Mark Lentczner <markl at g...>
wrote:
>
> > I have to generate in C++ and I will need it to parse strings with
> > Asian languages. So I guess I need some pretty efficient Unicode
> > support.
> I'm not clear here. Is your requirement that you have to parse
> constructs like:
> "<asian characters here>"
> where the only non-US-ASCII characters appear between quotes, and
> the only restriction between those quotes is that it is a sequence
> of valid Unicode characters? If so, this is easily doable with
> Antlr in C++ if you treat your input as UTF-8.
>
> If you need to support identifiers composed of non-US-ASCII
> characters, it is a bit more difficult, but still doable.
>
> This is exactly what I'm doing: My language is defined over the
> full Unicode character set, allows Unicode in string literals,
> comments, identifier names, and in a few cases operators (such as
> U+00F7, the division sign). I lex and parse the language with
> Antlr, generating a C++ lexer that accepts a UTF-8 encoded Unicode
> stream of bytes.
>
> I'd be happy to share my work on this.
>
> > Hope 3.0 will be out before end of summer because that's my
> > deadline.
> I think the time frame is longer than that.
>
> - Mark
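
To make the "treat your input as UTF-8" suggestion above concrete, here is
a minimal sketch in plain C++ of scanning a quoted string literal from a
UTF-8 byte buffer. It is not Antlr-generated code and not Mark's
implementation; the names utf8_sequence_length, StringToken and
scan_string_literal are hypothetical, chosen only for illustration. It
assumes the literal has no escape sequences, and it checks only the
structure of each UTF-8 sequence (lead byte followed by the right number
of continuation bytes); it does not reject every ill-formed sequence,
such as encoded surrogates.

```cpp
#include <cstddef>
#include <string>

// Number of bytes in the UTF-8 sequence that starts with lead byte b,
// or 0 if b cannot start a sequence (a continuation byte or an
// invalid lead byte such as 0xC0, 0xC1, or 0xF5..0xFF).
inline std::size_t utf8_sequence_length(unsigned char b) {
    if (b < 0x80) return 1;                 // US-ASCII
    if (b >= 0xC2 && b <= 0xDF) return 2;   // U+0080..U+07FF
    if (b >= 0xE0 && b <= 0xEF) return 3;   // U+0800..U+FFFF
    if (b >= 0xF0 && b <= 0xF4) return 4;   // U+10000 and above
    return 0;
}

// Hypothetical token type for this sketch.
struct StringToken {
    std::string text;         // raw UTF-8 bytes between the quotes
    std::size_t code_points;  // number of UTF-8 sequences in text
    std::size_t end;          // index just past the closing quote
};

// Scan a "..." literal that begins at input[pos]. Returns false if
// there is no opening quote, the literal is unterminated, or a byte
// sequence inside it is not structurally valid UTF-8. On success the
// result is written to out; on failure out is left unchanged.
inline bool scan_string_literal(const std::string& input,
                                std::size_t pos,
                                StringToken& out) {
    if (pos >= input.size() || input[pos] != '"') return false;

    StringToken tok;
    tok.code_points = 0;
    tok.end = 0;

    std::size_t i = pos + 1;
    while (i < input.size()) {
        unsigned char b = static_cast<unsigned char>(input[i]);
        if (b == '"') {                     // closing quote is ASCII
            tok.end = i + 1;
            out = tok;
            return true;
        }
        std::size_t len = utf8_sequence_length(b);
        if (len == 0 || i + len > input.size()) return false;
        // Every byte after the lead byte must look like 10xxxxxx.
        for (std::size_t k = 1; k < len; ++k) {
            unsigned char c = static_cast<unsigned char>(input[i + k]);
            if ((c & 0xC0) != 0x80) return false;
        }
        tok.text.append(input, i, len);     // copy the bytes unchanged
        ++tok.code_points;
        i += len;
    }
    return false;                           // ran out of input: unterminated
}
```

The sketch relies on a property of UTF-8: every byte of a multi-byte
sequence has its high bit set, so the ASCII quote byte 0x22 can only ever
be a real quote character. That is why a byte-oriented lexer can find the
end of the string without decoding the characters in between.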