[antlr-interest] Additional char from LEXER->getText
Mike Lischke
mike at lischke-online.de
Fri Aug 31 00:26:35 PDT 2012
Hi Jim,
> Actually, those routines are really only there for convenience. You will
> find them too slow and cumbersome for any complicated tasks. It is
> better to use the pointer to the input stream directly and avoid any
> copying and malloc() calls.
Well, this is what the target uses for $text in the grammar. If that code is not good, shouldn't the code generator use something better? I would like to avoid language-specific code in my grammar wherever I can.
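For reference, here is a minimal sketch of the pointer-based approach suggested in the quote above. It assumes a token exposes a start and a stop pointer into the input buffer and that the stop pointer is inclusive (it addresses the last byte of the token). TokenSpan, token_length and token_text are hypothetical names used only for illustration. They are not part of the ANTLR runtime.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a lexer token (assumption, not a runtime type):
// both members point into the original input buffer, and `stop` addresses
// the LAST byte of the token (inclusive), not one past it.
struct TokenSpan {
    const char *start;
    const char *stop;
};

// Length of the token text in bytes. An empty token (stop < start) yields 0.
inline std::size_t token_length(const TokenSpan &t) {
    assert(t.start != nullptr && t.stop != nullptr);
    return t.stop < t.start ? 0
                            : static_cast<std::size_t>(t.stop - t.start) + 1;
}

// Copies the token text exactly once, straight from the input buffer,
// without going through an intermediate string factory.
inline std::string token_text(const TokenSpan &t) {
    return std::string(t.start, token_length(t));
}
```

With an inclusive stop pointer, the "+ 1" is the place where an off-by-one would add or drop a trailing character, so it is worth checking against what the runtime actually stores.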
> However is this because you have a UTF8 input but are using the 8 bit
> input stream?
My setup goes like this:
input = antlr3StringStreamNew((pANTLR3_UINT8)utf8.c_str(), ANTLR3_ENC_UTF8, utf8.size(), (pANTLR3_UINT8)"sql-script");
input->setUcaseLA(input, ANTLR3_TRUE); // Make input case-insensitive. String literals must all be upper case in the grammar!
lexer = MySQL56LexerNew(input);
tokens = antlr3CommonTokenStreamSourceNew(ANTLR3_SIZE_HINT, TOKENSOURCE(lexer));
parser = MySQL56ParserNew(tokens);
MySQL56Parser_query_return ast = parser->query(parser);
Isn't that how it is supposed to work?
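To illustrate what I expect from the case-insensitive setting in the setup above, here is a rough sketch: lookahead returns upper-cased characters, while the underlying buffer (and therefore the token text) stays unchanged. SketchStream is a hypothetical, byte-oriented simplification written for this mail. It is not the runtime's input stream and ignores multi-byte UTF-8 sequences.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical, simplified input stream (assumption, not the ANTLR type).
struct SketchStream {
    std::string data;          // original input, never modified
    std::size_t pos = 0;       // index of the next character to consume
    bool upperCaseLA = false;  // when true, lookahead is upper-cased

    // Look ahead i characters (1-based). Returns -1 past the end of input.
    int LA(std::size_t i) const {
        assert(i >= 1);
        const std::size_t idx = pos + i - 1;
        if (idx >= data.size()) return -1;
        const unsigned char c = static_cast<unsigned char>(data[idx]);
        return upperCaseLA ? std::toupper(c) : c;
    }
};
```

The lexer rules then only ever see upper-case input, which is why the literals in the grammar have to be upper case, but any text taken from the buffer keeps its original case.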
Mike
--
www.soft-gems.net