[antlr-interest] unicode 16bit versus new 21bit stuff

John D. Mitchell johnm-antlr at non.net
Sun Jun 20 11:37:15 PDT 2004


>>>>> "Terence" == Terence Parr <parrt at cs.usfca.edu> writes:
>>>>>> On Jun 19, 2004, at 3:36 PM, Mark Lentczner writes:
[...]

>> Seems to me that you can still encode chars and tokens in the same
>> 32-bit int: any value <= 0x10FFFF is a Unicode code point, and any
>> value > 0x10FFFF is a token type.

>> Or am I missing something?

> Heh, you're right.  I was focused on only 11 bits left, but if I treat it
> as one 32-bit int rather than two smaller ints, then the values work out
> great!  We have 0x10FFFF+1 .. 0xFFFFFFFF to mess with.  That's um...lots. ;)
> Thanks!

Hmm... Is my senility setting in?  I thought I recalled that you had some
reason you needed them separated?

If not then rock on.
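
For reference, the single-int scheme quoted above could be sketched roughly
like this. It is only an illustration under stated assumptions: the class and
method names (SymbolCodec, isChar, encodeTokenType, and so on) are
hypothetical and are not part of ANTLR, and it assumes token types are
numbered from 0 and shifted above the Unicode range. Since Java ints are
signed, the range check uses an unsigned comparison so that values up to
0xFFFFFFFF still count as token types.

```java
// Hypothetical sketch, not ANTLR code: chars and token types share one int.
final class SymbolCodec {
    // Highest Unicode code point (same value as Character.MAX_CODE_POINT).
    public static final int MAX_CODE_POINT = 0x10FFFF;

    // First value available for token types: 0x10FFFF + 1.
    public static final int TOKEN_BASE = MAX_CODE_POINT + 1;

    private SymbolCodec() {}

    // True if the symbol lies in 0 .. 0x10FFFF, i.e. the Unicode range.
    // compareUnsigned treats 0x80000000 .. 0xFFFFFFFF as large positive values.
    public static boolean isChar(int symbol) {
        return Integer.compareUnsigned(symbol, MAX_CODE_POINT) <= 0;
    }

    // Everything above 0x10FFFF is treated as a token type.
    public static boolean isTokenType(int symbol) {
        return !isChar(symbol);
    }

    // Assumption: token types are non-negative and numbered from 0.
    public static int encodeTokenType(int tokenType) {
        if (tokenType < 0) {
            throw new IllegalArgumentException("token type must be >= 0");
        }
        return TOKEN_BASE + tokenType; // may wrap to a negative int; still > 0x10FFFF unsigned
    }

    // Inverse of encodeTokenType; rejects values in the Unicode range.
    public static int decodeTokenType(int symbol) {
        if (isChar(symbol)) {
            throw new IllegalArgumentException("symbol is a character, not a token type");
        }
        return symbol - TOKEN_BASE;
    }
}
```

With this layout, SymbolCodec.isChar('A') is true and
SymbolCodec.encodeTokenType(0) is 0x110000, the first value past the Unicode
range. Whether chars and tokens actually need to stay separated is a separate
question that this sketch does not answer.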

Thanks,
	John



 