[antlr-interest] Additional char from LEXER->getText
Mike Lischke
mike at lischke-online.de
Thu Aug 30 07:57:14 PDT 2012
Hi,
There seems to be a problem in the C-target lexer: getText returns one additional char.
I have this lexer rule:
UNDERSCORE_CHARSET: UNDERLINE_SYMBOL LETTER_WHEN_UNQUOTED+ { $type = check_charset($text); };
For input like:
SELECT _utf8 'text'
I actually get the string "_utf8 " (with a trailing space), which is not correct (I have the usual whitespace rule, of course). I think either LEXER->getText itself is wrong (its end pointer is one too far) or antlr38BitSubstr is. Looking at the code of the latter, I wonder why there's that +1. When the start and end pointers point to the same place in memory, I would expect an empty string to be returned, not the single char at the start position.
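To make the two readings of the end pointer concrete, here is a minimal sketch in plain C. It is not the runtime's code: substr_inclusive and substr_exclusive are hypothetical helpers written only for this illustration, and the assumption is that the +1 means the end pointer is treated as pointing at the last char of the token rather than one past it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, NOT part of the ANTLR3 C runtime.
 * Inclusive convention: stop points AT the last character, so the
 * length is stop - start + 1 and start == stop yields one character. */
size_t substr_inclusive(const char *start, const char *stop,
                        char *out, size_t out_size)
{
    assert(stop >= start);
    size_t len = (size_t)(stop - start) + 1;
    assert(len < out_size);          /* leave room for the terminator */
    memcpy(out, start, len);
    out[len] = '\0';
    return len;
}

/* Hypothetical helper, NOT part of the ANTLR3 C runtime.
 * Half-open convention: stop points ONE PAST the last character, so
 * the length is stop - start and start == stop yields an empty string. */
size_t substr_exclusive(const char *start, const char *stop,
                        char *out, size_t out_size)
{
    assert(stop >= start);
    size_t len = (size_t)(stop - start);
    assert(len < out_size);          /* leave room for the terminator */
    memcpy(out, start, len);
    out[len] = '\0';
    return len;
}
```

For the input SELECT _utf8 'text', the token _utf8 occupies offsets 7 to 11. Calling substr_inclusive with stop at offset 11 copies "_utf8", while the same call with stop at offset 12 (one past the token) copies "_utf8 ", which matches the symptom described here. Under that assumption the +1 is consistent with an inclusive end pointer, and the extra char would come from a caller passing a one-past-the-end pointer.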
I can work around this problem via pANTLR3_STRING->len - 1, but ...
Mike
--
www.soft-gems.net