[antlr-interest] Re: proposal for 2.7.4: charVocabulary defaults to ascii 1..127
Oliver Zeigermann
oliver at zeigermann.de
Sat May 1 14:48:31 PDT 2004
I always seem to be the one causing confusion. Let me try to make my
point clear:
I understand ANTLR operates on characters, not on bytes, is that right?
So, ANTLR does not need to worry about character encoding. Now, ASCII
*and* UTF-8 *and* UTF-16 are ways to encode characters in bytes while
Unicode is merely a mapping from integers to characters. Sorry if this
sounds picky, but when Terence was talking about ASCII I guess he meant
the first 127 Unicode characters. When I was talking about ISO-8859-1 I
made the same mistake and actually meant the first 255 Unicode characters.
Sorry again, for causing confusion :(
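To make the distinction concrete, here is a small Python sketch. It is
only an illustration, not anything taken from ANTLR: the helper
in_vocabulary is made up for this example, and the ranges 1..127 and
1..255 are just the ones discussed in this thread. It shows that one
character has a single Unicode integer but different byte sequences
depending on the encoding, and that a range check on characters never
needs to look at those bytes.

```python
# Illustration only: a code point is an integer; an encoding turns it into bytes.
ch = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE

code_point = ord(ch)                 # the Unicode integer, independent of any encoding
as_latin1 = ch.encode("iso-8859-1")  # one byte
as_utf8 = ch.encode("utf-8")         # two bytes
as_utf16 = ch.encode("utf-16-be")    # two bytes, big-endian, no byte order mark


def in_vocabulary(c, low, high):
    """Hypothetical helper: is the code point of c inside low..high?"""
    return low <= ord(c) <= high


print(code_point, as_latin1, as_utf8, as_utf16)
print(in_vocabulary(ch, 1, 127))  # False: outside the 1..127 range
print(in_vocabulary(ch, 1, 255))  # True: inside the 1..255 range
```

The same integer 233 comes out as three different byte sequences, which
is why a vocabulary given as a range of characters says nothing about
how the input file was encoded.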
Oliver
lgcraymer wrote:
> Oliver--
>
> Ok, so maybe I should have said
>
> charVocabulary = "UTF-8";
>
> and UTF-16. The point is more that named character sets have an
> advantage in that error messages can be issued. Ter's example of
> "Korean" is one that would pretty clearly not be recognized. Many of
> the vocabulary problems are failure to specify a range, but "Does
> ANTLR support unicode" is a close second.
>
> --Loring
>
>
> --- In antlr-interest at yahoogroups.com, Oliver Zeigermann <oliver at z...>
> wrote:
>
>>lgcraymer wrote:
>>
>>
>>>Ter--
>>>
>>>How about taking the next step? That is, support
>>>
>>>charVocabulary = "ASCII";
>>>
>>>and
>>>
>>>charVocabulary = "unicode";
>>
>>What would that look like? UTF-8? UTF-16? Something else?
>>
>>Oliver