[antlr-interest] Re: proposal for 2.7.4: charVocabulary defaults to ascii 1..127

Oliver Zeigermann oliver at zeigermann.de
Sun May 2 14:13:24 PDT 2004


I know this is leading us astray, so this will be my last post on this matter.

Mike Lischke wrote:

>>Now you seem to mix something up. Both UTF-16 and UTF-32 are 
>>character encodings as well, just as UTF-8. All of them are 
>>converted to characters before parsing.
> 
> 
> Sure, but how is the internal representation? Actually, it is UTF-16. So although it is a transformation format it is
> also the actual character representation. Hence UTF-16 (as well as UTF-32) can be processed directly. UTF-8 has to be
> converted first to one of these formats (usually, at least). This is what I meant.

What the internal representation is, you simply do not know, and there is 
also no need to know. Certainly, it is not UTF-16 with one 16-bit unit per 
character, as that only allows for 64K characters, which is far too little.
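
To make the 64K arithmetic concrete, here is a minimal Python sketch. The helper names (code_units, fits_in_one_16bit_unit) and the sample characters are made up for illustration and have nothing to do with ANTLR itself. It encodes a few characters as UTF-8, UTF-16 and UTF-32, shows that all three decode back to the same characters, and checks which code points fit into a single 16-bit unit. Note that UTF-16 as specified reaches beyond 64K by using two 16-bit units (a surrogate pair) for such characters.

```python
# Sketch only: compares encodings of the same characters.
# Helper names and sample characters are illustrative assumptions.

BMP_LIMIT = 2 ** 16  # 65,536 distinct values fit in one 16-bit unit


def code_units(ch: str) -> dict:
    """Return the code point and the encoded size in bytes per encoding."""
    return {
        "code_point": ord(ch),
        "utf8_bytes": len(ch.encode("utf-8")),
        "utf16_bytes": len(ch.encode("utf-16-be")),
        "utf32_bytes": len(ch.encode("utf-32-be")),
    }


def fits_in_one_16bit_unit(ch: str) -> bool:
    """True if the character's code point is below 64K."""
    return ord(ch) < BMP_LIMIT


def round_trips(ch: str) -> bool:
    """True if every encoding decodes back to the same character."""
    return all(
        ch.encode(enc).decode(enc) == ch
        for enc in ("utf-8", "utf-16-be", "utf-32-be")
    )


if __name__ == "__main__":
    # 'A', a-umlaut, euro sign, and U+1D11E (outside the first 64K)
    for ch in ["A", "\u00e4", "\u20ac", "\U0001D11E"]:
        print(code_units(ch), fits_in_one_16bit_unit(ch), round_trips(ch))
```

For U+1D11E the sketch reports 4 bytes in UTF-16, that is, two 16-bit units, while the other three samples take 2 bytes each. Whatever the parser sees after decoding is the same sequence of characters in all three cases.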

Oliver


 