[antlr-interest] unicode 16bit versus new 21bit stuff

Mike Lischke lists at lischke-online.de
Sat Jun 19 07:03:09 PDT 2004


Hi Terence, 

> This is purely programming convenience as I know how to print 
> out a token type by it's value range.  I don't want to go to 
> 64-bit ints as most CPUs are still 32bits natively.  If I use 
> 21-bit unicode values, that leaves 2^11 or 2048 token type 
> values, which makes me a bit nervous.

I also think this is too few. I think going 64-bit (or separating the character and token information into two variables)
would be the right choice if you want to do it right.
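
To make the trade-off concrete, here is a minimal sketch of both options in Java. It assumes one possible reading of the packed layout (code point in the low 21 bits, token type in the remaining 11 bits of a 32-bit int); the class and method names are hypothetical and not taken from ANTLR.

```java
public final class SymbolEncoding {
    // Assumed layout: low 21 bits = Unicode code point, high 11 bits = token type.
    static final int CHAR_BITS = 21;
    static final int CHAR_MASK = (1 << CHAR_BITS) - 1;               // 0x1FFFFF
    static final int MAX_TOKEN_TYPE = (1 << (32 - CHAR_BITS)) - 1;   // 2047

    // Option 1: pack both values into a single 32-bit int.
    static int pack(int tokenType, int codePoint) {
        if (tokenType < 0 || tokenType > MAX_TOKEN_TYPE) {
            throw new IllegalArgumentException("token type out of range: " + tokenType);
        }
        if (!Character.isValidCodePoint(codePoint)) {
            throw new IllegalArgumentException("not a Unicode code point: " + codePoint);
        }
        return (tokenType << CHAR_BITS) | codePoint;
    }

    // Unsigned shift, because a large token type sets the sign bit.
    static int tokenType(int packed) {
        return packed >>> CHAR_BITS;
    }

    static int codePoint(int packed) {
        return packed & CHAR_MASK;
    }

    // Option 2: keep the two values in separate variables, so neither
    // limits the range of the other.
    static final class Symbol {
        final int tokenType;
        final int codePoint;

        Symbol(int tokenType, int codePoint) {
            this.tokenType = tokenType;
            this.codePoint = codePoint;
        }
    }
}
```

With the packed form, token type 2048 is already rejected; with separate fields the token type can use the full int range. A 64-bit variant would look the same as option 1 with a long and a wider shift.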
 
> I want to do unicode "right" this time.  Anybody have a 
> strong opinion about the new supplemental (beyond 16bit 
> unicode) char values and/or whether 2048 is a serious token 
> type limitation?

AFAIK some of the new planes contain more mathematical characters, which could be important for extended expression
parsers.
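
As an illustration, U+1D400 (MATHEMATICAL BOLD CAPITAL A, from the Mathematical Alphanumeric Symbols block) lies outside the 16-bit range. The sketch below uses the code point methods on java.lang.Character (available since Java 5); the class and helper names are hypothetical.

```java
public final class SupplementaryDemo {
    // U+1D400 MATHEMATICAL BOLD CAPITAL A, a supplementary (plane 1) character.
    static final int MATH_BOLD_A = 0x1D400;

    // True if the code point fits in a single 16-bit char.
    static boolean fitsIn16Bits(int codePoint) {
        return Character.charCount(codePoint) == 1;
    }

    // Encode a code point as a Java (UTF-16) string; supplementary
    // characters become a surrogate pair of two chars.
    static String toUtf16(int codePoint) {
        return new String(Character.toChars(codePoint));
    }
}
```

A lexer that reads one 16-bit char at a time would see such a character as two separate surrogate values, so it would have to work on code points to match it as a single symbol.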

Mike
--
www.soft-gems.net



 