[antlr-interest] Re: Problems with Unicode support in ANTLR
micheal_jor
open.zone at virgin.net
Thu May 16 18:29:20 PDT 2002
> Okay, I see what you are talking about. Java's Character class does have
> support for some categories; see
> http://java.sun.com/j2se/1.4/docs/api/java/lang/Character.html
>
> Please look at the listed categories and let me know if it is too
> limited. In particular, java.lang.Character.getType(), and the static
> final category constants.
I saw the static constants but couldn't see where they were used.
Not surprising, since I can't believe someone actually thought
"getType()" makes sense as the accessor for a character's
Unicode General Category. What happened to getCategory() or
getGeneralCategory()? Sheez!
In any case, you are right that the feature is supported.
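
For anyone following along, here is a minimal sketch of how getType() reports a character's general category. The class name CategoryDemo and the helper categoryName() are made up for illustration; only Character.getType() and the category constants come from java.lang.Character, and only a handful of the constants are covered.

```java
public class CategoryDemo {
    // Hypothetical helper (not part of the JDK or ANTLR): returns a readable
    // name for a few of the general categories reported by Character.getType().
    static String categoryName(char c) {
        switch (Character.getType(c)) {
            case Character.UPPERCASE_LETTER:     return "UPPERCASE_LETTER";
            case Character.LOWERCASE_LETTER:     return "LOWERCASE_LETTER";
            case Character.DECIMAL_DIGIT_NUMBER: return "DECIMAL_DIGIT_NUMBER";
            case Character.START_PUNCTUATION:    return "START_PUNCTUATION";
            case Character.END_PUNCTUATION:      return "END_PUNCTUATION";
            default:                             return "OTHER"; // categories not listed above
        }
    }

    public static void main(String[] args) {
        char[] samples = { 'A', 'z', '7', '(', ')' };
        for (char c : samples) {
            System.out.println(c + " -> " + categoryName(c));
        }
    }
}
```

The switch works because the category constants are compile-time constants, which is also what would let a generated lexer dispatch on them directly.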
> I would rather not have my Unicode-parsing application depend on IBM's
> library since I would have to distribute it. I think that the
> java.lang.Character class's support is sufficient.
For the feature we've discussed so far, yes it is. The license for
IBM's package doesn't forbid extracting what we need into ANTLR, if
memory serves.
> Presumably, the modified ANTLR would then generate code like this:
>     int type = Character.getType(LA(1));
>     switch (type) {
>     case Character.END_PUNCTUATION:
>         mRULE(true);
>         theRetToken=_returnToken;
>         break;
>     ....
>     }
>
Erm... Terence, are you there? ;-)
Micheal