[antlr-interest] labels on chars

Terence Parr parrt at cs.usfca.edu
Fri Nov 3 15:05:17 PST 2006


On Nov 3, 2006, at 3:00 PM, Kay Roepke wrote:

>
> On 3. Nov 2006, at 23:53 , Terence Parr wrote:
>
>> Hi, in the lexer for v3 when you reference a char, should it be of  
>> type int or char?
>>
>> R : a='c' {char foo = $a;} ;
>>
>> Should $a be an int or char?  Char is more convenient and probably  
>> correct.  Any issues?  EOF perhaps?
>
> char is unsigned in java, isn't it? what happens if you get EOF (=  
> -1)? Does it wrap around to 0xFFFF?
> Having EOF==-1 bothered me in ObjC since I couldn't use enums to  
> hold tokentypes because of the -1 :(
> Other than that char should be ok, it covers the whole UTF-16  
> range, right?

Yep, 16 bits.  -1 would go to 0xFFFF, which I think is an invalid  
Unicode char.  Hmm...  Well, I'll leave it as int for now and see if  
anyone complains. ;)
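
To make the wrap-around concrete, here is a minimal Java sketch of the EOF problem. It is not ANTLR code: the class name CharEofDemo and the helpers narrow and isEof are hypothetical, and EOF = -1 is simply the value assumed in this thread.

```java
// Hypothetical demo class, not part of ANTLR.
final class CharEofDemo {
    // Assumption: EOF is represented as -1, as discussed in this thread.
    static final int EOF = -1;

    // Narrowing int -> char keeps only the low 16 bits.
    static char narrow(int c) {
        return (char) c;
    }

    // EOF can only be detected reliably while the value is still an int.
    static boolean isEof(int c) {
        return c == EOF;
    }

    public static void main(String[] args) {
        char wrapped = narrow(EOF);
        // char widens to int 65535 here, so this prints "ffff".
        System.out.println(Integer.toHexString(wrapped));
        // The char is promoted to 65535 before comparing, so this prints "false".
        System.out.println(wrapped == EOF);
        // Kept as an int, the sentinel is still recognizable: prints "true".
        System.out.println(isEof(EOF));
    }
}
```

Once the value has been narrowed to char, the -1 sentinel can no longer be told apart from the code unit 0xFFFF, which is the argument for keeping the label typed as int.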

Ter

