[antlr-interest] parsing file with token '\u0096' in it
G R
relationalalgebra at gmail.com
Tue Oct 30 08:08:00 PDT 2007
You should escape your Unicode character like this:
SPECIAL_CHAR:
'\\u0096'
;
Then the generated code should look like this (taken from a Select rule that matches '\u03c3'):
int _type = Select;
{
match("\\u03c3");
}
...
2007/10/30, timo.dufour at thomson.com <timo.dufour at thomson.com>:
>
> I'm currently writing a parser in ANTLR3 that has to be able to parse
> files containing the character '\u0096'. I added this to a token-rule, but
> the lexer doesn't match it. When I debugged the lexer, I noticed it
> recognizes the token and represents it as '8211', which is the token's
> ascii-code notation. But it doesn't get matched with the rule where it
> checks if it equals '\u0096'. Is there a better way to handle this token,
> for instance, not using the Unicode notation, but some other notation, so it
> matches '8211'?
>
> SPECIAL_CHAR:
>     '\u0096'
>     ;
>
> parts from generated lexer code:
>
> int LA_6 = input.LA(1);
> …
> if (LA_6 == '\u0096'){
> …
> }
>
> Timo
>
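A possible reading of the '8211' in the question quoted above, offered as an assumption and not as a confirmed diagnosis: 8211 is the decimal value of U+2013 (EN DASH), which is the character that the single byte 0x96 decodes to under the windows-1252 charset. If the input file stores the character as that byte and the stream is decoded with windows-1252 (a common platform default on Windows), the lexer would see 8211 and not 0x96 (decimal 150), so a comparison against '\u0096' would fail. The sketch below only illustrates the decoding difference; the class name ByteDecodingDemo and the helper decodeSingleByte are hypothetical, and it assumes the JDK in use ships the windows-1252 charset.

```java
import java.nio.charset.Charset;

public class ByteDecodingDemo {
    // Decode one raw byte with the named charset and return the
    // numeric value of the single character that results.
    static int decodeSingleByte(int rawByte, String charsetName) {
        byte[] bytes = new byte[] { (byte) rawByte };
        String decoded = new String(bytes, Charset.forName(charsetName));
        return decoded.charAt(0);
    }

    public static void main(String[] args) {
        // Byte 0x96 read as windows-1252 becomes U+2013 (EN DASH).
        System.out.println(decodeSingleByte(0x96, "windows-1252")); // prints 8211
        // The same byte read as ISO-8859-1 stays U+0096.
        System.out.println(decodeSingleByte(0x96, "ISO-8859-1"));   // prints 150
    }
}
```

If this is what is happening, two options would be to match '\u2013' in the rule, or to create the input stream with an explicit encoding (ANTLRFileStream in the ANTLR 3 runtime accepts an encoding name as a second constructor argument) so that the byte arrives as U+0096.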