[antlr-interest] Re: Problems with Unicode support in ANTLR

Thu May 16 22:59:19 PDT 2002

--- In antlr-interest at y..., "Matthew Ford" <Matthew.Ford at f...> wrote:
> This approach would not work for me as I need
> 
> IDENT
>  options {testLiterals=true;
>      paraphrase = "an identifier";}
>  : ('a'..'z'|'_'|'$'|'\u0080'..'\uFFFE')
> ('a'..'z'|'_'|'0'..'9'|'$'|'\u0080'..'\uFFFE')*
>  ;
> 
> So rather than sub-blocks, what I need is an efficient compression method to
> store these bitsets in ANTLR.

The \u0080..\uFFFE range might be overkill, as many characters in that 
range would not [normally] be usable as parts of an IDENT.
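
As a rough way to check this, the sketch below counts how many characters in that range Java itself would accept in an identifier. It assumes Java's own rules (Character.isJavaIdentifierStart and Character.isJavaIdentifierPart) are a reasonable stand-in for what your IDENT should allow, which may not hold for your language. The class and method names are made up for illustration, and the exact counts depend on the Unicode version shipped with your JDK.

```java
// Hypothetical helper, not part of ANTLR: measures how much of the
// U+0080..U+FFFE range Java treats as identifier characters.
final class IdentRangeCheck {
    static final char LO = '\u0080';
    static final char HI = '\uFFFE';

    /** Total number of characters in the inclusive range [LO, HI]. */
    static int rangeSize() {
        return HI - LO + 1;
    }

    /** Characters in [LO, HI] that may start a Java identifier. */
    static int countStart() {
        int n = 0;
        for (int c = LO; c <= HI; c++) {
            if (Character.isJavaIdentifierStart((char) c)) {
                n++;
            }
        }
        return n;
    }

    /** Characters in [LO, HI] that may appear after the first character. */
    static int countPart() {
        int n = 0;
        for (int c = LO; c <= HI; c++) {
            if (Character.isJavaIdentifierPart((char) c)) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println("range size:        " + rangeSize());
        System.out.println("identifier starts: " + countStart());
        System.out.println("identifier parts:  " + countPart());
    }
}
```

Whatever the exact numbers are on your JDK, the surrogate and private-use blocks alone account for several thousand characters that would not normally appear in an identifier.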

You are right that a more efficient BitSet representation is needed 
for ANTLR's Unicode support in general.
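
To illustrate one possible direction, here is a minimal sketch of a set stored as sorted, inclusive character ranges with a binary-search lookup. This is not how ANTLR stores its bitsets, and RangeCharSet and its methods are hypothetical names. It only shows that a set like your IDENT start set needs a handful of range bounds instead of one bit per character.

```java
import java.util.Arrays;

// Hypothetical range-based character set, not ANTLR's BitSet.
final class RangeCharSet {
    private final char[] lows;   // inclusive lower bounds, ascending
    private final char[] highs;  // inclusive upper bounds, same order

    /** bounds are lo, hi pairs: inclusive, ascending, non-overlapping. */
    RangeCharSet(char... bounds) {
        if (bounds.length % 2 != 0) {
            throw new IllegalArgumentException("expected lo/hi pairs");
        }
        int n = bounds.length / 2;
        lows = new char[n];
        highs = new char[n];
        for (int i = 0; i < n; i++) {
            lows[i] = bounds[2 * i];
            highs[i] = bounds[2 * i + 1];
            if (lows[i] > highs[i] || (i > 0 && lows[i] <= highs[i - 1])) {
                throw new IllegalArgumentException("ranges must be sorted and disjoint");
            }
        }
    }

    /** True if c falls inside one of the stored ranges. */
    boolean contains(char c) {
        int i = Arrays.binarySearch(lows, c);
        if (i >= 0) {
            return true;              // c is exactly a lower bound
        }
        int idx = -i - 2;             // last range whose lower bound is below c
        return idx >= 0 && c <= highs[idx];
    }

    /** Number of chars used to store the set (two per range). */
    int storageChars() {
        return lows.length * 2;
    }

    public static void main(String[] args) {
        // Start set of the IDENT rule quoted above, in ascending order.
        RangeCharSet start = new RangeCharSet(
                '$', '$', '_', '_', 'a', 'z', '\u0080', '\uFFFE');
        System.out.println("contains 'a': " + start.contains('a'));
        System.out.println("contains '0': " + start.contains('0'));
        System.out.println("chars stored: " + start.storageChars());
    }
}
```

The trade-off is that a lookup becomes a binary search rather than a single bit test, so for sets with many small scattered ranges a dense or paged bitset may still be the better choice.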

Micheal
