[antlr-interest] lexing multiple literals to one token
Don Caton
dcaton at shorelinesoftware.com
Sat Jul 30 06:19:52 PDT 2005
> -----Original Message-----
> From: antlr-interest-bounces at antlr.org
> [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Robert Anderson
> Sent: Wednesday, July 27, 2005 9:54 PM
> To: antlr-interest at antlr.org
> Subject: [antlr-interest] lexing multiple literals to one token
>
> I can't quite figure out the syntax for this:
>
> I want to lex two different (interchangeable) keywords into
> the same token. I want to use the tokens {..} mechanism
> because I want both of these to be considered by a
> testLiterals=true identifier rule option.
> How do I do this? The following don't seem to work:
>
> tokens {
> MYTOK="form1";
> MYTOK="form2";
> }
Easy: just override testLiteralsTable() in the lexer. At least, it's easy
if you're using C++; I don't know if you can do it in Java. Put this at the
top of your lexer.g, after the tokens block, and it will be copied into the
generated lexer code:
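{
    // Map the alternate spelling "form2" to MYTOK. "form1" is assumed to be
    // declared in the tokens block, so the base class handles it.
    int testLiteralsTable( int ttype ) const
    {
        if ( _tcsicmp( text.c_str(), _T( "form2" ) ) == 0 )
        {
            ttype = MYTOK;
        }
        else
        {
            // __super is an MSVC extension
            ttype = __super::testLiteralsTable( ttype );
        }
        return ttype;
    }
}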
If you need a case-sensitive comparison, use _tcscmp() instead, or just use
the '==' operator on the std::string class. 'text' is a member variable of
the lexer that contains the text of the current token being processed.
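
The override can be tried outside ANTLR with a small stand-alone model. This
is only a sketch: BaseLexer, MyLexer, setText(), equalsIgnoreCase() and the
token values are hypothetical stand-ins, not ANTLR's actual classes, and
std::tolower replaces _tcsicmp() so it compiles on any platform.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>

// Hypothetical token type values; a real ANTLR lexer gets these from the
// generated token types header.
enum TokenType { IDENT = 4, MYTOK = 5 };

// Stand-in for the lexer base class (an assumption, not ANTLR's code):
// looks the current token text up in a literals table and returns the
// matching token type, or the incoming type if the text is not a literal.
class BaseLexer {
public:
    virtual ~BaseLexer() = default;
    void setText(const std::string& t) { text = t; }
    virtual int testLiteralsTable(int ttype) const {
        auto it = literals.find(text);
        return it != literals.end() ? it->second : ttype;
    }
protected:
    std::string text;                                       // current token text
    std::map<std::string, int> literals{{"form1", MYTOK}};  // as if from tokens {..}
};

// Portable case-insensitive equality, used here in place of _tcsicmp() == 0.
inline bool equalsIgnoreCase(const std::string& a, const std::string& b) {
    return a.size() == b.size() &&
           std::equal(a.begin(), a.end(), b.begin(),
                      [](unsigned char x, unsigned char y) {
                          return std::tolower(x) == std::tolower(y);
                      });
}

// The override: "form2" in any case becomes MYTOK; everything else falls
// through to the base class lookup.
class MyLexer : public BaseLexer {
public:
    int testLiteralsTable(int ttype) const override {
        if (equalsIgnoreCase(text, "form2")) {
            return MYTOK;
        }
        return BaseLexer::testLiteralsTable(ttype);
    }
};
```

In this model, calling setText("FORM2") and then testLiteralsTable(IDENT) on
a MyLexer yields MYTOK, while text that is not a literal comes back as IDENT.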
Don