[antlr-interest] [ANTLR C 3.1.3] UCS2 input stream attempting to read beyond end of input
Andy Grove
andy.grove at codefutures.com
Mon Jun 29 14:02:27 PDT 2009
Hi Jim,
I am hitting some errors which seem to be related to the UCS2 input
stream attempting to read past the end of my UTF-16 input data. This
is the error I am seeing in one of my tests:
-memory-(1) : error 9 : Extraneous token, at offset 58
near [
: Extraneous input - expected <EOF>
Here are the input bytes:
SQLParser::parse() converted UTF-16 string (num bytes = 118) :
49 00 4e 00 53 00 45 00 52 00 54 00 20 00 49 00 4e 00 54 00 4f 00 20 00 70 00 65 00 70 00 70 00 [I.N.S.E.R.T. .I.N.T.O. .p.e.p.p.]
65 00 72 00 20 00 28 00 6e 00 61 00 6d 00 65 00 2c 00 20 00 74 00 61 00 73 00 74 00 65 00 29 00 [e.r. .(.n.a.m.e.,. .t.a.s.t.e.).]
20 00 56 00 41 00 4c 00 55 00 45 00 53 00 20 00 28 00 27 00 4a 00 61 00 6c 00 61 00 70 00 65 00 [ .V.A.L.U.E.S. .(.'.J.a.l.a.p.e.]
f1 00 6f 00 27 00 2c 00 20 00 27 00 68 00 6f 00 74 00 27 00 29 00 [..o.'.,. .'.h.o.t.'.).]
The strange thing is that, functionally, the test works as expected.
Another of my tests is actually failing with a different problem: the
generated parser is not calling the final action I have specified in
my grammar (maybe because it never reaches EOF?). I am wondering whether
these two issues are related. I have run the code through valgrind and
it reports no memory access errors. Here is my code for converting from
UTF-8 to UTF-16 and creating the input stream:
// input buffer
const UTF8* source = (const UTF8*) sql;
const UTF8* sourcestart = source;
const UTF8* sourceend = sourcestart + length;

// output buffer
UTF16* target = new UTF16[length + 16]; // extra chars for safety
UTF16* targetstart = target;
UTF16* targetend = target + length;
// fill with \0 to ensure string is null-terminated with at least 16 nulls
memset(target, 0, 2*(length+16));

// conversion
ConversionResult res = ConvertUTF8toUTF16(&sourcestart, sourceend,
                                          &targetstart, targetend,
                                          strictConversion);
if (res != conversionOK) {
    // cleanup ..
    throw "failed";
}

input = antlr3NewUCS2StringInPlaceStream((pANTLR3_UINT16)target, length, NULL);
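One thing that may be worth checking, offered as an assumption rather than a confirmed diagnosis: `length` is the UTF-8 byte count, while the stream reads 16-bit code units. In the test string, 'ñ' takes two bytes in UTF-8 but only one UTF-16 code unit, so the UTF-8 input is 60 bytes long while the converted buffer holds 59 code units (the 118 bytes dumped above). If the size argument is interpreted as a number of 16-bit units, passing `length` would expose one trailing zero unit to the lexer. The sketch below uses a hypothetical helper, `utf16_units`, which is not part of ANTLR or ConvertUTF, to show the mismatch. It assumes well-formed UTF-8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of ANTLR or ConvertUTF): counts the UTF-16
// code units that well-formed UTF-8 input occupies after conversion.
std::size_t utf16_units(const std::string& utf8) {
    std::size_t units = 0;
    for (unsigned char c : utf8) {
        if ((c & 0xC0) == 0x80) {
            continue;  // continuation byte: its sequence was already counted
        }
        // Lead bytes 0xF0 and above start 4-byte sequences, which become a
        // surrogate pair (two units); every other sequence becomes one unit.
        units += (c >= 0xF0) ? 2 : 1;
    }
    return units;
}

// The test statement from the post, with 'ñ' written as its UTF-8 bytes.
// The literal is split so the hex escape cannot absorb the following letter.
const std::string kTestSql =
    "INSERT INTO pepper (name, taste) VALUES ('Jalape\xC3\xB1" "o', 'hot')";
```

For this statement, `kTestSql.size()` and `utf16_units(kTestSql)` differ by one. With the Unicode reference converter, `ConvertUTF8toUTF16` advances `targetstart` past the last unit it wrote, so after a successful conversion `targetstart - target` should give the same count without a separate pass.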
Any suggestions would be appreciated.
Thanks,
Andy.