[antlr-interest] unicode 16bit versus new 21bit stuff

Terence Parr parrt at cs.usfca.edu
Sat Jun 19 10:09:34 PDT 2004


On Jun 19, 2004, at 7:03 AM, Mike Lischke wrote:

> Hi Terence,
>
>> This is purely programming convenience as I know how to print
>> out a token type by its value range.  I don't want to go to
>> 64-bit ints as most CPUs are still 32bits natively.  If I use
>> 21-bit unicode values, that leaves 2^11 or 2048 token type
>> values, which makes me a bit nervous.
>
> I also think this is too few. I think going 64 bit (or separating 
> character and token information into two variables)
> would be the right choice if we want to do it right.

I'm secretly planning to allow all sorts of cool stuff like parsers 
that can handle single char tokens w/o going to the lexer and so on.  
Having the parser at runtime be able to distinguish char from token 
type just by looking at the value was going to be mighty handy.
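
A minimal sketch of the value-range idea with a 32-bit int, assuming the low 21 bits hold a Unicode code point and the upper 11 bits hold a token type. The class and method names (Label32, isChar, fromTokenType, tokenType) are hypothetical and are not ANTLR API. Token type 0 is excluded here because it would collide with the character range.

```java
// Hypothetical helper, not ANTLR code: one 32-bit int is either a char or a token type.
final class Label32 {
    static final int CHAR_BITS = 21;                           // covers U+0000..U+10FFFF
    static final int CHAR_LIMIT = 1 << CHAR_BITS;              // first value past the char range
    static final int MAX_TOKEN_TYPES = 1 << (32 - CHAR_BITS);  // 2^11 = 2048

    // A label is a character when nothing is set above the low 21 bits.
    static boolean isChar(int label) {
        return label >= 0 && label < CHAR_LIMIT;
    }

    // Token types live in the upper 11 bits; 0 is reserved so it cannot look like a char.
    static int fromTokenType(int type) {
        if (type < 1 || type >= MAX_TOKEN_TYPES) {
            throw new IllegalArgumentException("token type out of range: " + type);
        }
        return type << CHAR_BITS;
    }

    // Unsigned shift so token types that set the sign bit still decode correctly.
    static int tokenType(int label) {
        return label >>> CHAR_BITS;
    }
}
```

With this layout a parser could test Label32.isChar(label) to decide whether it is looking at a raw character or a lexer token, at the cost of the 2048-value ceiling mentioned above.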

In the previous version, I made a number of decisions based upon the 
current state of the art in CPU speed / architecture, which of course 
changes pretty damn fast.  I wonder if we shouldn't just go 64 bit for
the token types, leaving a full 32 bits each for characters and for
token types, all within the same value.  Who knows, this may also make
some other things convenient.  What if we could encode the TokenStream
channel in the token type so we can "tune" to only certain channels?  
What if we had special tokens other than EOF?  Perhaps there will be a 
need.
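
A rough sketch of the 64-bit variant, assuming the low 32 bits hold a character and the high 32 bits hold the token type, with the top 8 of those bits borrowed for a channel number. The exact split, and the names Label64, ofChar, ofToken, tokenType and channel, are assumptions made for illustration only.

```java
// Hypothetical layout, not ANTLR code:
//   bits  0..31  character (zero for tokens)
//   bits 32..55  token type (1 .. 2^24 - 1)
//   bits 56..63  channel    (0 .. 255)
final class Label64 {
    static final int MAX_TYPE = (1 << 24) - 1;

    static long ofChar(int codePoint) {
        return codePoint & 0xFFFFFFFFL;        // keep only the low 32 bits
    }

    static long ofToken(int channel, int type) {
        if (channel < 0 || channel > 255) {
            throw new IllegalArgumentException("channel out of range: " + channel);
        }
        if (type < 1 || type > MAX_TYPE) {
            throw new IllegalArgumentException("token type out of range: " + type);
        }
        return ((long) channel << 56) | ((long) type << 32);
    }

    // Characters never set anything in the high 32 bits.
    static boolean isChar(long label) {
        return (label >>> 32) == 0;
    }

    static int tokenType(long label) {
        return (int) ((label >>> 32) & 0xFFFFFF);
    }

    static int channel(long label) {
        return (int) (label >>> 56);
    }
}
```

"Tuning" to a channel would then be a comparison on Label64.channel(label), without consulting a separate field.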

Hmm...I wonder how fast 64-bit processors will become the norm (G5s are 
there and AMD is too, right?)?  How horribly does Java do 64-bit ints 
now for comparison and other rot?  ANTLR 3.0 won't be available for a 
while...perhaps 64 bits ain't that bad an idea.  BTW, passing around 
another variable is unwieldy coding-wise and will just be two 32-bit 
ints anyway, right?

Too bad I don't have #define or a typename I could use so the actual 
type could be changed later.  Would be nice to see LabelType instead of 
int.
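
Java has no typedef for primitives, so one possible workaround is a small wrapper class that hides the underlying representation. The sketch below is only an illustration of that trade-off; LabelType here is a hypothetical class, and wrapping every label in an object costs allocations that a bare int does not.

```java
// Hypothetical stand-in for a typedef: callers see LabelType, never the raw int.
final class LabelType {
    private final int value;   // the one place to widen to long later

    private LabelType(int value) {
        this.value = value;
    }

    static LabelType of(int value) {
        return new LabelType(value);
    }

    int asInt() {
        return value;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof LabelType && ((LabelType) o).value == value;
    }

    @Override
    public int hashCode() {
        return Integer.hashCode(value);
    }

    @Override
    public String toString() {
        return "LabelType(" + value + ")";
    }
}
```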

Thanks for the ideas...

Ter
--
CS Professor & Grad Director, University of San Francisco
Creator, ANTLR Parser Generator, http://www.antlr.org
Cofounder, http://www.jguru.com
Cofounder, http://www.knowspam.net enjoy email again!
Cofounder, http://www.peerscope.com pure link sharing





 