[antlr-interest] Unicode escapes in C++
Kochismo
kochismo at gmail.com
Tue Nov 7 07:36:08 PST 2006
Hi,
I'm interested in parsing a plain ASCII file which represents Unicode
characters as escaped hex digits. For example:
blah\uff20\uff30blah
is the string blah, Unicode character U+FF20, Unicode character U+FF30, then
blah. Recognising it with the lexer is simple enough, but the lexer returns
token text as C++ std::string, rather than a Unicode-friendly std::wstring.
Is there a way I can handle this from within the lexer? Or will I have to
write code to convert the string token into a wstring?
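
For reference, the second option (converting the token text after lexing) could look roughly like the sketch below. It is only an illustration under assumptions: decodeUnicodeEscapes and hexValue are hypothetical helper names, not part of ANTLR, and the code assumes every escape is a backslash, a lowercase u, and exactly four hex digits. Anything that does not match that shape is copied through unchanged.

```cpp
#include <cstddef>
#include <string>

// Returns the numeric value of a hex digit, or -1 if c is not a hex digit.
static int hexValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical helper (not an ANTLR API): expands \uXXXX escapes found in
// ASCII token text into wide characters and returns the result.
std::wstring decodeUnicodeEscapes(const std::string& text) {
    std::wstring out;
    out.reserve(text.size());
    for (std::size_t i = 0; i < text.size(); ++i) {
        // A full escape needs six characters: '\\', 'u', four hex digits.
        if (text[i] == '\\' && i + 5 < text.size() && text[i + 1] == 'u') {
            unsigned int code = 0;
            bool valid = true;
            for (std::size_t j = i + 2; j < i + 6; ++j) {
                int v = hexValue(text[j]);
                if (v < 0) { valid = false; break; }
                code = code * 16 + static_cast<unsigned int>(v);
            }
            if (valid) {
                out.push_back(static_cast<wchar_t>(code));
                i += 5;  // skip the rest of the escape; the loop adds one more
                continue;
            }
        }
        // Plain character or malformed escape: copy it through unchanged.
        out.push_back(static_cast<wchar_t>(static_cast<unsigned char>(text[i])));
    }
    return out;
}
```

Passing the token text for blah\uff20\uff30blah through this function would give a ten-character wstring. A four-digit escape always fits in wchar_t, whether that type is 16 or 32 bits wide. Surrogate pairs written as two escapes are passed through as two separate code units, which is only correct where wchar_t holds UTF-16.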