[antlr-interest] Additional char from LEXER->getText

Jim Idle jimi at temporal-wave.com
Fri Aug 31 08:53:30 PDT 2012


I added support for $text because so many examples used it. It is fine as long as you are not doing anything that is performance- or memory-sensitive.
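A minimal sketch of what working on the input buffer directly, without copying, could look like. The token_span struct and both helper functions are hypothetical stand-ins invented for illustration and are not part of the ANTLR3 C runtime; the sketch assumes 8-bit input and a token described by pointers to its first and last byte.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a token: two pointers into the input buffer.
   start is the first byte of the token text, stop is the last (inclusive). */
typedef struct {
    const unsigned char *start;
    const unsigned char *stop;
} token_span;

/* Length of the token text in bytes. Nothing is copied or allocated. */
static size_t span_length(const token_span *t)
{
    return (size_t)(t->stop - t->start) + 1;
}

/* Compare the token text with a NUL-terminated literal, in place. */
static int span_equals(const token_span *t, const char *literal)
{
    size_t len = strlen(literal);
    return span_length(t) == len && memcmp(t->start, literal, len) == 0;
}
```

Because the text is only ever read where it already sits in the input buffer, a check like span_equals(&t, "SELECT") needs no malloc() and no string object.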

I think that the goal of non-language-specific actions isn't that realistic and will get in the way of doing things correctly.

Also note that the case-insensitive option works only on ASCII, not on the full Unicode code point set.
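To illustrate the limitation, here is a sketch of ASCII-only case folding in general. It is not the runtime's actual code, and the function names are hypothetical. Letters outside a-z, such as the UTF-8 bytes of "ä", pass through unchanged and so never match their upper-case form.

```c
#include <stddef.h>

/* Upper-case one byte if it is an ASCII lower-case letter;
   every other byte value is returned unchanged. */
static unsigned char ascii_upper(unsigned char c)
{
    return (c >= 'a' && c <= 'z') ? (unsigned char)(c - 'a' + 'A') : c;
}

/* Case-insensitive comparison of n bytes, folding ASCII letters only. */
static int ascii_iequal(const unsigned char *a, const unsigned char *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (ascii_upper(a[i]) != ascii_upper(b[i]))
            return 0;
    }
    return 1;
}
```

With this kind of folding, "select" and "SELECT" compare equal, while "ä" (bytes C3 A4 in UTF-8) and "Ä" (bytes C3 84) do not.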

Jim

On Aug 31, 2012, at 12:26 AM, Mike Lischke <mike at lischke-online.de> wrote:

> 
> Hi Jim,
> 
>> Actually, those routines are really only there for convenience. You will
>> find them too slow and cumbersome for any complicated tasks. It is
>> better to use the pointer to the input stream directly and avoid any
>> copying and malloc() calls.
> 
> Well, this is what the target uses for the $text token in the grammar. If that code is not good, shouldn't the code generator use something better? I would like to avoid language-specific code in my grammar where I can.
> 
>> However, is this because you have UTF-8 input but are using the 8-bit
>> input stream?
> 
> 
> 
> My setup goes like this:
> 
>  input = antlr3StringStreamNew((pANTLR3_UINT8)utf8.c_str(), ANTLR3_ENC_UTF8, utf8.size(), (pANTLR3_UINT8)"sql-script");
>  input->setUcaseLA(input, ANTLR3_TRUE); // Make input case-insensitive. String literals must all be upper case in the grammar!
> 
>  lexer = MySQL56LexerNew(input);
>  tokens = antlr3CommonTokenStreamSourceNew(ANTLR3_SIZE_HINT, TOKENSOURCE(lexer));
>  parser = MySQL56ParserNew(tokens);
> 
>  MySQL56Parser_query_return ast = parser->query(parser);
> 
> Isn't that how it is supposed to work?
> 
> Mike
> -- 
> www.soft-gems.net
> 
> 

