[antlr-interest] Re: Overlapping tokens
David Maxwell
david at crlf.net
Tue Oct 11 14:45:03 PDT 2005
On Wed, 05 Oct 2005, David Maxwell wrote:
> In a lex/yacc example, I could do something like this:
>
> "FooBar" { printf ("Found a FOOBAR lex token\n");
> strcpy(yylval.stval,yytext);
> return FOOBAR; }
>
> [a-zA-Z_]* { printf("Found a ID lex token\n");
> strcpy(yylval.stval,yytext);
> return ID; }
Okay - so it was a bit of an RTFM (though no one even said that...).
testLiterals can do most of what I want as described above, but not
perfectly. The generated code runs the {} action attached to the ID rule
before the lookup in the literals table. As a result, the action can't
see the final token type - it isn't known yet.
The generated code looks like what is shown below. Is there any
construct that allows insertion of code _after_ the token type is set?
(Other than hand-editing the Lexer.cpp after every rebuild.)
void Lexer::mID(bool _createToken) {
    ... // match code
    { Your code here }
#line 442 "Lexer.cpp"
    _ttype = testLiteralsTable(_ttype);
    if ( _createToken && _token==ANTLR_USE_NAMESPACE(antlr)nullToken && _ttype!=ANTLR_USE_NAMESPACE(antlr)Token::SKIP ) {
        _token = makeToken(_ttype);
        _token->setText(text.substr(_begin, text.length()-_begin));
    }
    _returnToken = _token;
    _saveIndex=0;
}
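
To make the ordering problem concrete, here is a small stand-alone sketch
that imitates it outside of ANTLR. Everything in it is hypothetical: the
TokenType values, lookupLiteral, and the two functions are stand-ins written
for illustration, not part of the ANTLR runtime. The second function shows
one possible workaround, which is to do the literals lookup inside the
action itself so the action sees the resolved type. In a real grammar that
would presumably mean turning testLiterals off for the rule and calling
testLiteralsTable(_ttype) from the action; that part is an untested
assumption.

```cpp
#include <map>
#include <string>

// Hypothetical token types; a real ANTLR lexer generates its own constants.
enum TokenType { ID = 4, FOOBAR = 5 };

// Stand-in for the literals table lookup: returns the literal's token type
// if the text is a known literal, otherwise the type passed in.
int lookupLiteral(const std::map<std::string, int>& literals,
                  const std::string& text, int defaultType) {
    auto it = literals.find(text);
    return it == literals.end() ? defaultType : it->second;
}

struct Result {
    int typeSeenByAction;  // token type visible while the user action runs
    int finalType;         // token type after the rule has finished
};

// Imitates the generated order: user action first, literals lookup second.
Result actionBeforeLookup(const std::map<std::string, int>& literals,
                          const std::string& text) {
    int ttype = ID;
    int seen = ttype;                              // user action runs here
    ttype = lookupLiteral(literals, text, ttype);  // lookup happens too late
    return {seen, ttype};
}

// Possible workaround: the action performs the lookup itself.
Result lookupInsideAction(const std::map<std::string, int>& literals,
                          const std::string& text) {
    int ttype = ID;
    ttype = lookupLiteral(literals, text, ttype);  // done by the action
    int seen = ttype;                              // action now sees the result
    return {seen, ttype};
}
```

With a table that maps "FooBar" to FOOBAR, actionBeforeLookup reports ID as
the type seen by the action even though the final type is FOOBAR, while
lookupInsideAction reports FOOBAR in both places.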
--
David Maxwell, david at vex.net|david at maxwell.net --> Unless you have a solution
when you tell them things like that, most people collapse into a gibbering,
unthinking mass. This is the same reason why you probably don't tell your
boss about everything you read on BugTraq! - Signal 11