[antlr-interest] A couple of questions regarding literals and unicode
Terence Parr
parrt at jguru.com
Fri Dec 6 13:51:32 PST 2002
On Friday, December 6, 2002, at 12:47 PM, davidjpenton2002 wrote:
> Greetings. I am struggling a little with getting literals recognized.
> I seem to have problems getting non-alphabetic characters to be
> recognized in literals. For example:
>
> class P extends Parser;
>
> startRule
> : "<?xml" SOMETHING
> ;
>
> class L extends Lexer;
> options
> {
> charVocabulary='\003'..'\377';
> }
>
> SOMETHING : "abcd";
>
> The inclusion of the non-alphabetic characters "<?" in the literal
> seems to cause problems.
The literals in the parser are tested in the lexer, but you have to
have a lexer rule that matches those characters. "<?" is not matched by
any rule, so the lexer cannot return that token.
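
A minimal sketch of one way to do that, assuming ANTLR 2.x grammar
syntax: give the lexer a rule that matches the "<?xml" text and refer to
that token by name in the parser. The rule name XMLDECL_OPEN is
hypothetical (it does not appear in the original grammar), and the
charVocabulary line assumes the 8-bit range the original post appears to
intend. This has not been run against a particular ANTLR release.

```
class P extends Parser;

// Refer to the lexer token by name rather than by a string literal.
startRule
    : XMLDECL_OPEN SOMETHING
    ;

class L extends Lexer;
options
{
    // Assumed 8-bit vocabulary; widen it if the input is not 8-bit.
    charVocabulary='\003'..'\377';
}

// Hypothetical rule: gives the lexer a path that consumes '<' and '?'.
XMLDECL_OPEN : "<?xml" ;

SOMETHING : "abcd" ;
```

With a rule like this the lexer can consume "<" and "?" and hand the
parser a token. Keeping the "<?xml" literal in the parser should still
require some lexer rule whose matched text equals the literal, since the
literals test only runs on text a lexer rule has already matched.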
>
> As you might guess, I am trying to parse some xml. So this leads to a
> more general question. Does antlr handle unicode? The info on the
> website does not seem to make it clear whether it does or not.
It does, and I'm thinking of making enhancements real quick before 2.7.2
comes out.
Ter
--
Co-founder, http://www.jguru.com
Creator, ANTLR Parser Generator: http://www.antlr.org
Lecturer in Comp. Sci., University of San Francisco