[antlr-interest] How to use arabic letters in my tokens ?
Ahmed Hamouda
ahmedh at horizonssoftware.com
Wed Mar 26 13:54:59 PDT 2008
Thank you for your reply.
>when there's a contiguous range you can specify it like so:
'\u00c0'..'\u00c7'
When I use the range, I get a compiler error in the generated code saying that there is no definition for the method "MatchRange".
>And again, those don't appear to be Arabic
>characters. Run "charmap" and make sure you
>switch it to Unicode mode. You're probably
>putting in the ANSI encodings from your Arabic codepage instead.
Sorry, I don't know what "charmap" is. Could you please tell me how to get a table of the Unicode values of the characters?
Thank you
Best Regards
Ahmed Hamouda (MCTS)
Software Engineer
Horizons Software
Address: 93 Haroun Al Rasheed Street, Heliopolis, Cairo, Egypt. 11351.
Tel: +202-2644-3709
Mobile: +2010-33-55-879
Fax: +202-2632-0661
Website: www.horizonssoftware.com
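On the question of where to get a table of Unicode values: "charmap" is the Windows Character Map utility, and the same information can be listed with a short script. The following is a minimal sketch in Python, not part of the original thread. The function name is hypothetical, and the default range U+0621 to U+064A is an assumption that covers the basic Arabic alphabet only. It prints each letter's \uXXXX escape, which is the form used in the grammar alternatives quoted in this thread.

```python
import unicodedata

def arabic_letter_table(first=0x0621, last=0x064A):
    """Return (escape, name) pairs for Arabic letters in the given code point range."""
    table = []
    for cp in range(first, last + 1):
        name = unicodedata.name(chr(cp), "")
        # Keep only letters; this skips ARABIC TATWEEL and any unassigned code points.
        if name.startswith("ARABIC LETTER"):
            table.append((f"\\u{cp:04x}", name))
    return table

if __name__ == "__main__":
    for escape, name in arabic_letter_table():
        print(escape, name)
```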
-----Original Message-----
From: Gavin Lambert [mailto:antlr at mirality.co.nz]
Sent: Wednesday, March 26, 2008 10:38 PM
To: Ahmed Hamouda; antlr-interest at antlr.org
Subject: Re: [antlr-interest] How to use arabic letters in my tokens ?
At 08:25 27/03/2008, Ahmed Hamouda wrote:
>I want to define a token that covers all possible
>letters that the user can use.
>These letters include Arabic letters.
>I tried to add them by hand as follows: 'Ç'
>| 'È' | 'Ì' ... and so on, but I received an error during generation
Firstly, those don't appear to be Arabic to me; just
regular extended Latin characters. Secondly, you
can't write Unicode characters directly in either
ANTLRv2 or ANTLRv3 since ANTLRv2 doesn't support
Unicode at all and ANTLRv3 still uses ANTLRv2 to
parse the grammars themselves. (ANTLRv3 grammars
can recognise Unicode characters though.)
>I also tried to use these alternatives
>
>| '\u00c2' | '\u00c3' | '\u00c4' | '\u00c5' |
>'\u00c6' | '\u00c7' | '\u00c8' | '\u00c9'
> | '\u00c0' |
> '\u00ca' | '\u00cb' | '\u00cc' | '\u00cd' |
[...]
First, when there's a contiguous range you can specify it like so:
'\u00c0'..'\u00c7'
And again, those don't appear to be Arabic
characters. Run "charmap" and make sure you
switch it to Unicode mode. You're probably
putting in the ANSI encodings from your Arabic codepage instead.
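The suspected codepage mix-up can be checked with a short script. This is a sketch in Python, not something from the thread. It assumes the letters were typed in the Windows-1256 (Arabic) codepage and then displayed as Latin-1, and `recover_escapes` is a hypothetical helper name. Under that assumption, the Latin characters quoted above map back to Arabic letters, and the script reports their actual Unicode escapes.

```python
def recover_escapes(shown, shown_as="latin-1", typed_in="cp1256"):
    """Re-interpret mis-decoded text and return one \\uXXXX escape per character."""
    # Undo the wrong decoding to get the original bytes, then decode them properly.
    # (Assumes every recovered character lies in the Basic Multilingual Plane.)
    real = shown.encode(shown_as).decode(typed_in)
    return [f"\\u{ord(ch):04x}" for ch in real]

if __name__ == "__main__":
    # U+00C7, U+00C8 and U+00CC are the three Latin characters quoted in the thread.
    print(recover_escapes("\u00c7\u00c8\u00cc"))
```

If the assumption holds, the escapes to put in the grammar fall in the Arabic block (around '\u0627') rather than around '\u00c0'.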