[antlr-interest] Re: unicode support
    micheal_jor <open.zone at virgin.net> 
    Tue Dec 17 07:45:58 PST 2002
    
    
  
--- In antlr-interest at yahoogroups.com, Pete Forman <pete.forman at w...> wrote:
> At 2002-12-16 14:51 -0800, Terence Parr wrote:
> >I can convert a table to Java with a shell script probably if we can
> >find a convenient table.
> 
> http://www.unicode.org/Public/UNIDATA/ReadMe.txt
> 
> That is for the current version, i.e. Unicode 3.2.  You might wish to
> stick at version 3.0 which is the last 16 bit version.  Current
> Unicode uses 21 bits but Java does not grok it.

What worked for me in the past:

I imported the http://www.unicode.org/Public/3.1-Update/UnicodeData-3.1.0.txt
text file into a database and wrote simple queries to dump the list of
char-values and char-ranges for each UnicodeCategory. I used MS SQL Server,
with MS Access as a prototyping-friendly front end for writing the
queries and the formatting code.
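
For anyone without a database handy, the same grouping can be roughed out in plain Java. This is only a sketch, not the queries I actually ran: the class and method names are made up for illustration, the sample lines are typed into the code rather than read from the file, and the "First>"/"Last>" range entries that UnicodeData.txt uses for large blocks are not expanded. It relies on the file being semicolon-separated, with the hex code point in field 0 and the general category in field 2.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper; the name is illustrative only.
class UnicodeDataGrouper {

    // Groups code points by general category (field 2 of each line).
    static Map<String, List<Integer>> groupByCategory(List<String> lines) {
        Map<String, List<Integer>> byCategory = new TreeMap<>();
        for (String line : lines) {
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            String[] fields = line.split(";", -1); // -1 keeps empty trailing fields
            int codePoint = Integer.parseInt(fields[0], 16);
            String category = fields[2];
            byCategory.computeIfAbsent(category, k -> new ArrayList<>()).add(codePoint);
        }
        return byCategory;
    }

    public static void main(String[] args) {
        // A few sample lines in UnicodeData.txt layout; a real run would
        // read every line of the downloaded file instead.
        List<String> sample = Arrays.asList(
            "0030;DIGIT ZERO;Nd;0;EN;;0;0;0;N;;;;;",
            "0031;DIGIT ONE;Nd;0;EN;;1;1;1;N;;;;;",
            "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;");
        for (Map.Entry<String, List<Integer>> e : groupByCategory(sample).entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue());
        }
    }
}
```

Collapsing each category's sorted code points into ranges is then a single pass over the list.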

This strategy should work with other RDBMSes too, as long as all you need
is the char-values and char-ranges of each UnicodeCategory. Otherwise,
Character.getType(char ch) will tell you which UnicodeCategory a given
char belongs to.
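
The Character.getType route can also produce the range lists directly, with no table at all. A minimal sketch, assuming you only care about 16-bit chars; the class name, method names and the U+XXXX output format are my own choices for illustration. Keep in mind that the results follow whatever Unicode version your JDK implements, so they need not match the 3.1 data file.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; the name is illustrative only.
class CategoryRanges {

    // Scans all 16-bit chars and collects contiguous runs whose
    // Character.getType() value equals the requested category constant.
    static List<int[]> rangesFor(int category) {
        List<int[]> ranges = new ArrayList<>();
        int start = -1; // -1 means "not currently inside a run"
        for (int c = 0; c <= 0xFFFF; c++) {
            boolean match = Character.getType((char) c) == category;
            if (match && start < 0) {
                start = c;
            } else if (!match && start >= 0) {
                ranges.add(new int[] {start, c - 1});
                start = -1;
            }
        }
        if (start >= 0) {
            ranges.add(new int[] {start, 0xFFFF}); // run reaches the last char
        }
        return ranges;
    }

    // Formats a single value as U+XXXX and a range as U+XXXX..U+YYYY.
    static String format(int[] r) {
        return r[0] == r[1]
            ? String.format("U+%04X", r[0])
            : String.format("U+%04X..U+%04X", r[0], r[1]);
    }

    public static void main(String[] args) {
        for (int[] r : rangesFor(Character.DECIMAL_DIGIT_NUMBER)) {
            System.out.println(format(r));
        }
    }
}
```

Swap in any other constant, e.g. Character.UPPERCASE_LETTER, to dump a different category.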
Micheal