[antlr-interest] Another little bug: allow \u104a0 (more than 4 after \u)

Hendrik Maryns qwizv9b02 at sneakemail.com
Fri Oct 31 03:36:27 PDT 2008


Hi all,

If one wants to specify exotic Unicode ranges, they go beyond \uffff.
Feature request: allow longer ones, such as \u104a0 (Osmanya digit 0: 𐒠).
This symbol is parsed unproblematically by javac, so you’ll have to
adapt your Java parser as well, Terence!  (At least, Eclipse doesn’t
complain, although it doesn’t render it properly.)

Probably, you want

fragment JavaIDDigit
  : '\u0030'..'\u0039'
  | '\u0660'..'\u0669'
  | '\u06f0'..'\u06f9'
  | '\u07c0'..'\u07c9'
  | '\u0966'..'\u096f'
  | '\u09e6'..'\u09ef'
  | '\u0a66'..'\u0a6f'
  | '\u0ae6'..'\u0aef'
  | '\u0b66'..'\u0b6f'
  | '\u0be6'..'\u0bef'
  | '\u0c66'..'\u0c6f'
  | '\u0ce6'..'\u0cef'
  | '\u0d66'..'\u0d6f'
  | '\u0e50'..'\u0e59'
  | '\u0ed0'..'\u0ed9'
  | '\u0f20'..'\u0f33'
  | '\u1040'..'\u1049'
  | '\u1369'..'\u1371'
  | '\u17e0'..'\u17e9'
  | '\u1810'..'\u1819'
  | '\u1946'..'\u194f'
  | '\u19d0'..'\u19d9'
  | '\u1b50'..'\u1b59'
//  | '\u104a0'..'\u104a9' Osmanya, ANTLR bug!
//  | '\u10a40'..'\u10a43'
//  | '\u1d360'..'\u1d371'
  ;

I’m leaving out the mathematical digits here, since I didn’t test them
with javac.  Similarly, you’ll have to expand Letter.
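One caveat worth checking before relying on the javac behaviour: as far as I can tell from the Java Language Specification, a \u escape in Java source takes exactly four hex digits, so \u104a0 would be read as U+104A followed by a plain '0', and a code point above \uffff such as U+104A0 has to be written as a surrogate pair. The sketch below illustrates this with the standard java.lang.Character API; the class name SupplementaryDigits and the helper toJavaEscape are made up for this example and are not part of ANTLR or of the Java grammar.

```java
public class SupplementaryDigits {
    // U+104A0 OSMANYA DIGIT ZERO, a code point outside the Basic Multilingual Plane.
    static final int OSMANYA_ZERO = 0x104A0;

    // Hypothetical helper: renders a code point as the four-hex-digit escape(s)
    // that Java source would need, using a surrogate pair above U+FFFF.
    static String toJavaEscape(int codePoint) {
        StringBuilder sb = new StringBuilder();
        for (char c : Character.toChars(codePoint)) {
            sb.append(String.format("\\u%04X", (int) c));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Five hex digits after the escape prefix: Java consumes only the first four.
        String fiveDigits = "\u104a0";
        System.out.println("chars in five-digit literal: " + fiveDigits.length());
        System.out.println("first char: U+"
                + Integer.toHexString(fiveDigits.charAt(0)).toUpperCase());

        // The same code point written as a surrogate pair is one code point, two chars.
        String pair = new String(Character.toChars(OSMANYA_ZERO));
        System.out.println("escape form: " + toJavaEscape(OSMANYA_ZERO));
        System.out.println("code points: " + pair.codePointCount(0, pair.length()));

        // The int-based Character methods classify the supplementary code point.
        System.out.println("isDigit: " + Character.isDigit(OSMANYA_ZERO));
        System.out.println("isJavaIdentifierPart: "
                + Character.isJavaIdentifierPart(OSMANYA_ZERO));
    }
}
```

If that reading is right, a lexer rule for these digits would have to match either real code points above \uffff (the feature requested here) or the corresponding surrogate pairs, rather than a five-digit \u escape.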

H.
-- 
Hendrik Maryns
http://tcl.sfs.uni-tuebingen.de/~hendrik/
==================
Ask smart questions, get good answers:
http://www.catb.org/~esr/faqs/smart-questions.html
