[antlr-interest] Does ANTLR exactly allow Unicode?
新买
inshua at gmail.com
Sun Oct 22 05:44:20 PDT 2006
Hi Tommy,
Thank you very much, the problem is resolved.
By the way, in ANTLR 2.7.6 Unicode escapes cannot be used inside a double-quoted
string; they must be separated into character literals one by one, as in '\u5f00''\u59cb'.
Thanks.
On 10/22/06, Tommy Nordgren <tommy.nordgren at chello.se> wrote:
>
>
> On 22 okt 2006, at 11.59, 新买 wrote:
>
> > I created a simple grammar to study ANTLR, using Chinese
> > characters as letters, and ANTLR reported no warnings or errors.
> > However, when I feed it a piece of demo input like the one below:
> >
> > 开始
> > 输出 "开始开始";
> > 结束
> >
> > it reports an awful error:
> > line 1:1: unexpected char: 0xBF
> > at LearnLexer.nextToken(LearnLexer.java:102)
> > at antlr.TokenBuffer.fill(TokenBuffer.java:69)
> > at antlr.TokenBuffer.LT(TokenBuffer.java:86)
> > at antlr.LLkParser.LT(LLkParser.java:56)
> > at LearnParser.multiWriteStatement(LearnParser.java:89)
> > at Test.main(Test.java:18)
> >
> > Tracing the lexer, I found something interesting: the char "开" is
> > "\u5f00", but it is reported as 0xBF.
> > Could somebody tell me exactly how to use Unicode with ANTLR? Thanks a lot.
> You need to set up your input (character) stream to use the correct
> encoding when converting from its input (byte) stream.
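
Tommy's advice can be sketched in plain Java as below. This is only an
illustration under assumptions: the class name EncodingDemo, the helper
decode, the file name demo.txt and the choice of UTF-8 are not from the
thread. It shows that the same bytes yield the expected two characters
when decoded with the right charset, and one character per byte when
decoded as ISO-8859-1, which is roughly what a lexer reading raw bytes
would see.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingDemo {
    // Hypothetical helper: decode a byte stream into characters
    // using an explicitly chosen charset.
    static String decode(byte[] bytes, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(bytes), charset)) {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The first line of the demo input, encoded as UTF-8 (assumed encoding).
        byte[] bytes = "\u5f00\u59cb".getBytes(StandardCharsets.UTF_8);

        String right = decode(bytes, StandardCharsets.UTF_8);
        String wrong = decode(bytes, StandardCharsets.ISO_8859_1);

        System.out.printf("correct charset: %d chars, first = U+%04X%n",
                right.length(), (int) right.charAt(0));
        System.out.printf("wrong charset:   %d chars, first = U+%04X%n",
                wrong.length(), (int) wrong.charAt(0));

        // With the generated lexer, the Reader would be handed over directly
        // (file name and encoding are assumptions):
        // LearnLexer lexer = new LearnLexer(
        //     new InputStreamReader(new FileInputStream("demo.txt"), "UTF-8"));
    }
}
```

The charset passed to InputStreamReader has to match whatever encoding
the input file was actually saved in; UTF-8 is used here only as an
example.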