[antlr-interest] problem with unicode characters in comments within ANTLR .g files ...
Gavin Lambert
antlr at mirality.co.nz
Sat May 24 05:13:41 PDT 2008
At 02:19 24/05/2008, Raymer David-fdr017 wrote:
>This fragment generates an exception ...
>
>O_SQUOTE : '\u2018'; // ‘
>C_SQUOTE : '\u2019'; // ’
>DQUOTE : '\"';
>O_DQUOTE : '\u201C'; // “
>C_DQUOTE : '\u201D'; // ”
[...]
>The problem appears to be the non-\ encoded Unicode
>characters. Is this behavior expected?
ANTLR v3 grammar files can't contain any raw (unescaped) Unicode
characters, comments included, since the grammar itself is still
parsed with ANTLR v2, and v2 can't cope with Unicode. (Lexers
generated by ANTLR v3 can recognise Unicode characters just fine,
but you need to write them as \uXXXX escapes within the grammar.)
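
As a rough illustration of the workaround (this is not part of ANTLR),
a small Python sketch like the one below could flag or escape stray
non-ASCII characters in a .g file before the tool sees it. The names
find_non_ascii and escape_non_ascii are made up for this example. It
assumes the grammar text has already been decoded to a Python string
and that a plain \uXXXX escape is acceptable wherever the raw
character appeared, including inside comments.

```python
def find_non_ascii(text):
    """Return (line, column, code point) for each non-ASCII character.

    Line and column numbers are 1-based.
    """
    hits = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ord(ch) > 0x7F:
                hits.append((line_no, col_no, "U+%04X" % ord(ch)))
    return hits


def escape_non_ascii(text):
    """Replace each non-ASCII character with a \\uXXXX escape.

    Assumption: only characters up to U+FFFF are handled, because a
    four-digit escape cannot express anything larger directly.
    """
    out = []
    for ch in text:
        code = ord(ch)
        if code <= 0x7F:
            out.append(ch)
        elif code <= 0xFFFF:
            out.append("\\u%04X" % code)
        else:
            raise ValueError("character U+%X needs more than 4 hex digits" % code)
    return "".join(out)


if __name__ == "__main__":
    # Hypothetical grammar line with a raw curly quote in the comment.
    sample = "C_SQUOTE : '\\u2019'; // \u2019"
    print(find_non_ascii(sample))
    print(escape_non_ascii(sample))
```

Running the finder first shows where the offending characters are;
whether to escape them or simply delete them from the comments is a
matter of taste.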