[antlr-interest] Hex or Unicode char definition?

Jim Idle jimi at temporal-wave.com
Tue Sep 15 11:34:57 PDT 2009


You have to specify the code point ranges explicitly. A range simply
covers the code points defined by the Unicode standard; there is no
significance to 'a'..'z' other than the code points for a and z. Use
'\unnnn' for weird and wonderful code points.
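
For example, the CHAR = %x01-7F rule quoted below could be written
roughly like this (a sketch only; keep or rename the rule to suit
your grammar):

   fragment CHAR
       : '\u0001'..'\u007F'   // %x01-7F: C0 controls and Basic Latin, excluding NUL
       ;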

Jim

On Sep 15, 2009, at 9:51 AM, Andreas Volz <lists at brachttal.net> wrote:

> On Sun, 13 Sep 2009 23:38:12 +0200, Andreas Volz wrote:
>
>> Hello,
>>
>> How is it possible to define chars as hex or Unicode chars? E.g.:
>>
>> fragment ALPHA
>>    : 'a'..'z' | 'A'..'Z' | '.' | ',' | ' ' | '@'
>>    ;
>>
>> I have a syntax in RFC 2234 [1] format, where the chars are defined
>> in hex. How should I translate this to ANTLR? E.g.:
>>
>>   CHAR         = %x01-7F
>>        ; Any C0 Controls and Basic Latin, excluding NULL from
>>        ; Code Charts, pages 7-6 through 7-9 in [UNICODE]
>>
>> regards
>> Andreas
>>
>> [1] http://tools.ietf.org/html/rfc2234
>
> Could nobody answer this question? I have also been wondering where
> it is defined which chars really fall between 'A'..'Z'. Does this
> depend on the locale? For locale-aware char detection I would expect
> a way to define Unicode chars in a parser rule.
>
> regards
> Andreas

