[antlr-interest] Unicode handling
John D. Mitchell
johnm-antlr at non.net
Wed Apr 21 16:31:14 PDT 2004
>>>>> "Mark" == Mark Lentczner <markl at glyphic.com> writes:
[...]
> Does anyone see any pitfalls to this other than increasing the look ahead
> for the lexer? Since in our source language, all the meaningful
> punctuation is in the visible US-ASCII range, the only place where the
> difference between parsing Unicode characters and parsing UTF-8 encoded
> Unicode characters would matter is in things like the NAME token production.
> To me, this seems much preferable to extending the C++ support with
> some Unicode library (such as IBM's ICU).
I concur.
In fact, I almost took that same approach but I was able to dodge the
Unicode bullet completely. :-)
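For anyone following along, here is a rough sketch of the idea as I read it: leave the input as raw UTF-8 bytes, match punctuation as plain ASCII, and let only the NAME rule accept multi-byte sequences. It is written in Python rather than as an ANTLR grammar or C++, purely for illustration. The token names (NAME, NUMBER, PUNCT, WS) and the tokenize() helper are hypothetical, and the multi-byte pattern is deliberately loose (it does not reject overlong encodings or surrogates), so treat it as a starting point under those assumptions, not a finished lexer.

```python
import re

# Loose match for one UTF-8 multi-byte sequence (assumption: good enough for
# identifiers; it does NOT reject overlong forms or encoded surrogates).
UTF8_MULTIBYTE = (
    rb"(?:[\xC2-\xDF][\x80-\xBF]"       # 2-byte sequence
    rb"|[\xE0-\xEF][\x80-\xBF]{2}"      # 3-byte sequence
    rb"|[\xF0-\xF4][\x80-\xBF]{3})"     # 4-byte sequence
)

NAME_START = rb"(?:[A-Za-z_]|" + UTF8_MULTIBYTE + rb")"
NAME_PART = rb"(?:[A-Za-z0-9_]|" + UTF8_MULTIBYTE + rb")"

# Only NAME ever looks at bytes >= 0x80; every other rule is pure ASCII.
TOKEN_RE = re.compile(
    rb"(?P<NAME>" + NAME_START + NAME_PART + rb"*)"
    rb"|(?P<NUMBER>[0-9]+)"
    rb"|(?P<PUNCT>[!-/:-@\[-`{-~])"
    rb"|(?P<WS>[ \t\r\n]+)"
)


def tokenize(data: bytes):
    """Split UTF-8 encoded bytes into (kind, raw_bytes) pairs, skipping whitespace."""
    pos = 0
    tokens = []
    while pos < len(data):
        match = TOKEN_RE.match(data, pos)
        if match is None:
            # A stray continuation byte or an invalid lead byte lands here.
            raise ValueError(f"unexpected byte 0x{data[pos]:02X} at offset {pos}")
        if match.lastgroup != "WS":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


if __name__ == "__main__":
    source = "na\u00efve = \u03b2_2 + x;".encode("utf-8")
    for kind, raw in tokenize(source):
        print(kind, raw.decode("utf-8"))
```

The cost is the one Mark mentions: a NAME character can now span up to four input bytes, so the lexer needs correspondingly more lookahead inside that rule, while every other rule stays byte-for-byte ASCII.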
For Antlr v3, aside from my perennial haranguing for complete and proper
hoisting support, I really want to get rid of all of this ridiculous use of
in-band signalling. Please join me in pestering Ter about this. :-)
Have fun,
John