[antlr-interest] parsing file with token '\u0096' in it

G R relationalalgebra at gmail.com
Tue Oct 30 08:08:00 PDT 2007


You should escape your Unicode character like this:

SPECIAL_CHAR:
            '\\u0096'
;


Then the generated code should look like this (shown here for a rule named Select that uses '\\u03c3'):

            int _type = Select;
            {
            match("\\u03c3");
            }
...
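
As for the '8211' in the quoted message: 8211 is U+2013 (en dash), which is what the byte 0x96 becomes when a file is decoded as windows-1252. Decoded as ISO-8859-1, the same byte stays U+0096. So it may be that the file holds the byte 0x96 and is being read with a windows-1252 default encoding. That is a guess, not something confirmed in this thread. A minimal Java sketch of the difference (the class name DecodeDemo and the method decodeFirstChar are made up for illustration):

```java
import java.nio.charset.Charset;

public class DecodeDemo {
    // Decode a single byte with the named charset and return the
    // code of the resulting character.
    static int decodeFirstChar(byte b, String charsetName) {
        String s = new String(new byte[] { b }, Charset.forName(charsetName));
        return s.charAt(0);
    }

    public static void main(String[] args) {
        // windows-1252 maps byte 0x96 to U+2013 (en dash)
        System.out.println(decodeFirstChar((byte) 0x96, "windows-1252")); // prints 8211
        // ISO-8859-1 maps every byte to the code point with the same value
        System.out.println(decodeFirstChar((byte) 0x96, "ISO-8859-1"));   // prints 150
    }
}
```

If that is what is happening, one option is to match '\u2013' in the rule. Another is to read the input with an explicit encoding. ANTLR 3's ANTLRFileStream has a constructor that takes an encoding name as its second argument.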


2007/10/30, timo.dufour at thomson.com <timo.dufour at thomson.com>:
>
>  I'm currently writing a parser in ANTLR3 that has to be able to parse
> files containing the character '\u0096'. I added this to a token-rule, but
> the lexer doesn't match it. When I debugged the lexer, I noticed it
> recognizes the token and represents it as '8211', which is the token's
> ASCII-code notation. But it doesn't get matched with the rule where it
> checks if it equals '\u0096'. Is there a better way to handle this token,
> for instance, not using the Unicode notation, but some other notation, so it
> matches '8211'?
>
>
>
> SPECIAL_CHAR:
>
>             '\u0096'
>
> ;
>
>
>
> parts from generated lexer code:
>
>             int LA_6 = input.LA(1);
>
>>
>             if (LA_6 == '\u0096'){
>
>>
>             }
>
>
>
> Timo
>