[antlr-interest] Re: proposal for 2.7.4: charVocabulary defaults to ascii 1..127

Oliver Zeigermann oliver at zeigermann.de
Sat May 1 14:48:31 PDT 2004


I always seem to be the one causing confusion. Let me try to make my 
point clear:

As I understand it, ANTLR operates on characters, not on bytes; is that
right? If so, ANTLR does not need to worry about character encoding. ASCII
*and* UTF-8 *and* UTF-16 are ways to encode characters as bytes, while
Unicode itself is merely a mapping from integers (code points) to
characters. Sorry if this sounds picky, but when Terence was talking about
ASCII I guess he meant the Unicode code points up to 127. When I was
talking about ISO-8859-1 I made the same mistake and actually meant the
Unicode code points up to 255.
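
To make the distinction concrete, here is a small Java sketch. It is only
an illustration and has nothing to do with ANTLR's internals; the class and
method names are made up. It decodes the same character from three
different byte encodings and shows that, once decoded, the character is the
same integer in every case:

```java
import java.nio.charset.Charset;

// Hypothetical demo class, not part of ANTLR.
public class CodePointDemo {

    // Decode the bytes using the named encoding and return the
    // Unicode code point of the first decoded character.
    static int firstCodePoint(byte[] bytes, String charsetName) {
        String text = new String(bytes, Charset.forName(charsetName));
        return text.codePointAt(0);
    }

    public static void main(String[] args) {
        // The character 'é' (U+00E9) as bytes in three encodings.
        byte[] latin1 = { (byte) 0xE9 };
        byte[] utf8   = { (byte) 0xC3, (byte) 0xA9 };
        byte[] utf16  = { 0x00, (byte) 0xE9 };

        // All three lines print 233: the bytes differ, the character does not.
        System.out.println(firstCodePoint(latin1, "ISO-8859-1"));
        System.out.println(firstCodePoint(utf8, "UTF-8"));
        System.out.println(firstCodePoint(utf16, "UTF-16BE"));
    }
}
```

So a vocabulary given as a range of integers describes characters after
decoding, whatever encoding the input bytes happened to use.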


Sorry again, for causing confusion :(

Oliver

lgcraymer wrote:
> Oliver--
> 
> Ok, so maybe I should have said
> 
> charVocabulary = "UTF-8";
> 
> and UTF-16.  The point is more that named character sets have an
> advantage in that error messages can be issued.  Ter's example of
> "Korean" is one that would pretty clearly not be recognized.  Many of
> the vocabulary problems are failure to specify a range, but "Does
> ANTLR support unicode" is a close second.
> 
> --Loring
> 
> 
> --- In antlr-interest at yahoogroups.com, Oliver Zeigermann <oliver at z...>
> wrote:
> 
>>lgcraymer wrote:
>>
>>
>>>Ter--
>>>
>>>How about taking the next step?  That is, support
>>>
>>>charVocabulary = "ASCII";
>>>
>>>and
>>>
>>>charVocabulary = "unicode";
>>
>>What would that look like? UTF-8? UTF-16? Something else?
>>
>>Oliver



 