[antlr-interest] Re: Problems with Unicode support in ANTLR

micheal_jor open.zone at virgin.net
Thu May 16 14:04:30 PDT 2002


--- In antlr-interest at y..., Brian Smith <brian-l-smith at u...> wrote:
> Are the predefined Unicode blocks that are handled by
> java.lang.Character.UnicodeBlock sufficient for what you need? Or, do
> you need a different classification?

No. Unicode blocks are a different concept from Unicode General 
Categories. I don't think Java's standard libraries support Unicode 
categories.
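
To make the distinction concrete, here is a minimal sketch in plain Java.
A block is a contiguous range of code points, while a general category
classifies each character by kind, so characters in the same block can
fall into different categories. The class and method names are made up
for illustration. One caveat on the point above: as far as I can tell,
java.lang.Character.getType(char) does return a general category
constant, so the basic categories may be reachable from the standard
library after all.

```java
// Illustrative sketch only: BlockVsCategory and its methods are hypothetical names.
class BlockVsCategory {

    // The Unicode block containing c (a contiguous code point range),
    // or null if c is not a member of any defined block.
    public static Character.UnicodeBlock blockOf(char c) {
        return Character.UnicodeBlock.of(c);
    }

    // The general category of c, as one of the constants defined in
    // java.lang.Character (UPPERCASE_LETTER, DECIMAL_DIGIT_NUMBER, ...).
    public static int categoryOf(char c) {
        return Character.getType(c);
    }

    public static void main(String[] args) {
        // All four samples sit in the same block but differ in category.
        char[] samples = { 'A', 'a', '1', '+' };
        for (char c : samples) {
            System.out.println(c + "  block=" + blockOf(c)
                    + "  category=" + categoryOf(c));
        }
    }
}
```

The category is printed as its numeric constant; compare it against the
named constants in java.lang.Character to interpret it.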

The open source ICU4J from IBM supports Unicode much better and 
has UCharacter and UCharacterCategory classes that do the job for me.

http://oss.software.ibm.com/icu4j/

> I was thinking of patching ANTLR's Java generator to be able to use
> named Unicode character categories as "pre-defined" "protected" lexer
> rules, but supporting anything more than the Character class handles
> is over my head.

That would be a useful addition - I mean the ability to define 
such "preset" rules in ANTLR. I can do the work for Unicode 
categories once the basic framework is in place.
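
As a rough sketch of what a named "preset" rule might test at runtime,
the Java below maps a few category abbreviations to a per-character
check. Everything here is an assumption for illustration: the class
NamedCategories, the method matches, and the choice of names are
hypothetical and not part of ANTLR or its generated code. The check
uses java.lang.Character.getType; a library such as ICU4J could
presumably be swapped in at that one call.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not an ANTLR API: looks up a general category by
// its two-letter abbreviation and tests a character against it.
class NamedCategories {

    private static final Map<String, Integer> BY_NAME = new HashMap<>();

    static {
        // Only a handful of categories are listed; a real table would
        // cover all of them.
        BY_NAME.put("Lu", (int) Character.UPPERCASE_LETTER);
        BY_NAME.put("Ll", (int) Character.LOWERCASE_LETTER);
        BY_NAME.put("Nd", (int) Character.DECIMAL_DIGIT_NUMBER);
        BY_NAME.put("Pc", (int) Character.CONNECTOR_PUNCTUATION);
        BY_NAME.put("Zs", (int) Character.SPACE_SEPARATOR);
    }

    // True if c belongs to the named category; rejects unknown names
    // instead of silently returning false.
    public static boolean matches(String name, char c) {
        Integer type = BY_NAME.get(name);
        if (type == null) {
            throw new IllegalArgumentException("unknown category: " + name);
        }
        return Character.getType(c) == type.intValue();
    }

    public static void main(String[] args) {
        System.out.println(matches("Lu", 'A'));
        System.out.println(matches("Nd", 'A'));
    }
}
```

A generated lexer could call such a check where it would otherwise
compare against an explicit character range.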

Ter, is it OK to have ANTLR rely on additional libraries, or would I 
have to somehow port the required Unicode functionality into ANTLR 
directly?

Micheal



 
