[antlr-interest] MSVC 7.0
Ric Klaren
klaren at cs.utwente.nl
Fri Oct 3 02:30:48 PDT 2003
Hi,
On Thu, Oct 02, 2003 at 05:18:04PM -0000, Arnar Birgisson wrote:
> STADVAER : "staðvær";
>
> I recall that the ANTLR documentation states that the inputCharset for
> its metalanguage is 7-bit ASCII, so according to that, ANTLR should
> have yielded an error for this rule. However, this was translated
> directly to a string constant in the C++ file. (Note: this works fine in
> Java)
I think it was raised to 8-bit at some point. The documentation is poor, though.
> Then, somewhere along the way, the expected character becomes an int,
> and should be 0x000000f0, but is generated as 0xfffffff0. When 0xf0 is
> seen on the input, this causes a MismatchedCharException. When it tries to
> generate its message, it calls charName for 0xfffffff0 (the expected
> char), which in turn calls isprint, and since 0xfffffff0 is negative, it
> blows up.
Hmm, I had expected a problem like this one of these days. The runtime uses
a lot of ints where it should be using unsigned types, precisely to prevent
these sign-extension troubles. I have already started changing some of them,
but I have not touched this part yet.
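
As a rough sketch of the sign-extension problem described above: the helper
names below are made up for illustration and are not part of the ANTLR
runtime. It assumes an 8-bit char, and a compiler where plain char may be
signed, as is the usual default with MSVC.

```cpp
#include <cassert>
#include <cctype>

// Hypothetical helper (not an ANTLR runtime function): widen a char to int
// without sign extension by going through unsigned char first.
inline int widenChar(char c) {
    return static_cast<unsigned char>(c);
}

// Hypothetical helper: std::isprint has undefined behaviour for negative
// values other than EOF, so only pass it values in the unsigned char range.
inline bool isPrintableChar(int ch) {
    return ch >= 0 && ch <= 0xFF && std::isprint(ch) != 0;
}

// Small self-check of the two helpers.
inline void checkWidening() {
    const char eth = '\360';  // byte 0xF0, the eth character in ISO-8859-1
    const signed char raw = static_cast<signed char>(eth);
    const int naive = raw;    // sign-extended, so the int is negative
    assert(naive < 0);
    assert(widenChar(eth) == 0xF0);   // the value the lexer should compare
    assert(!isPrintableChar(naive));  // guarded: isprint is never reached
}
```

Converting through unsigned char before widening keeps the value at 0xf0 no
matter whether plain char is signed on the compiler in use.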
> I guess this is partly my fault since I didn't follow ANTLR's
> documentation carefully enough. Changing the rule to
>
> STADVAER : "sta\360v\346r";
>
> seems to fix this (it's butt-ugly though :o). However, I would like to
> point out that this worked in Java, with its multibyte string
> constants, and along the way, antlr.Tool never complained about its
> input.
antlr.Tool often does not complain where it should ;) To get back to the
point: I thought I had fixed all of these quoting issues in the C++ codegen.
It seems I'll have to reinvestigate.
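
For reference, a small sketch of what the octal-escaped workaround literal
contains in C++, assuming the input is ISO-8859-1. byteAt and checkEscapes
are hypothetical helpers for illustration, not ANTLR functions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: read byte i of a string as a value in 0..255.
inline int byteAt(const std::string& s, std::size_t i) {
    return static_cast<unsigned char>(s[i]);
}

// Checks the bytes produced by the octal escapes in the workaround literal.
inline void checkEscapes() {
    const std::string word = "sta\360v\346r";  // seven single-byte characters
    assert(word.size() == 7);
    assert(byteAt(word, 3) == 0xF0);  // \360 octal is 0xF0 (eth)
    assert(byteAt(word, 5) == 0xE6);  // \346 octal is 0xE6 (ae ligature)
}
```

The escapes pin down the exact byte values, so the literal no longer depends
on the encoding of the grammar file or of the generated source.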
> I'm sending this here mostly as a reference for others lexing
> non-7-bit ASCII data with a C++ lexer, in case they hit the same walls I
> did.
8-bit input should work fine, with some caveats. I will look at this.
Multibyte encodings are a no-go in C++, though.
Cheers,
Ric
--
-----+++++*****************************************************+++++++++-------
---- Ric Klaren ----- j.klaren at utwente.nl ----- +31 53 4893722 ----
-----+++++*****************************************************+++++++++-------
"Never argue with an idiot, for they will bring you down to their
level and beat you with experience." --- Unknown