[antlr-interest] parsing file with token '\u0096' in it

timo.dufour at thomson.com timo.dufour at thomson.com
Tue Oct 30 07:59:45 PDT 2007


I'm currently writing a parser in ANTLR3 that has to be able to parse
files containing the character '\u0096'. I added this character to a
token rule, but the lexer doesn't match it. When I debugged the lexer, I
noticed that it reads the character as 8211, which seems to be its
decimal character code, so the generated comparison against '\u0096'
fails. Is there a better way to handle this character, for instance by
writing it in some notation other than the Unicode escape, so that the
rule matches 8211?

 

SPECIAL_CHAR:
            '\u0096'
;

 

parts from generated lexer code:

            int LA_6 = input.LA(1);
            ...
            if (LA_6 == '\u0096'){
                        ...
            }
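
A quick check on the number 8211, offered only as a sketch: 8211 is
decimal for U+2013, and U+2013 is what the single byte 0x96 becomes when
it is decoded as windows-1252. The snippet below assumes (this is not
confirmed for my files) that the input holds the raw byte 0x96 and that
the stream is read with a windows-1252 default charset. The class name
Decode0x96 and the helper decode() are made up for illustration.

```java
import java.nio.charset.Charset;

public class Decode0x96 {
    // Hypothetical helper: decode one byte with the named charset and
    // return the resulting character as an int, like input.LA(1) would.
    static int decode(byte b, String charsetName) {
        String s = new String(new byte[] { b }, Charset.forName(charsetName));
        return s.charAt(0);
    }

    public static void main(String[] args) {
        byte raw = (byte) 0x96; // assumed content of the input file

        // windows-1252 maps byte 0x96 to U+2013 (decimal 8211).
        System.out.println(decode(raw, "windows-1252"));

        // ISO-8859-1 maps byte 0x96 to U+0096 (decimal 150).
        System.out.println(decode(raw, "ISO-8859-1"));
    }
}
```

If that assumption holds, two things might be worth trying: matching
'\u2013' in the rule instead of '\u0096', or passing an explicit
encoding when creating the stream, e.g. new ANTLRFileStream(fileName,
"ISO-8859-1").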

 

Timo
