[antlr-interest] trouble with ids and keywords

Bob Marinier bob.marinier at soartech.com
Fri Feb 6 14:07:01 PST 2009


Hi,

I'm using ANTLR 2.7.6 and I have a problem with keywords and identifiers
conflicting. Specifically, if an identifier starts with a keyword, the
lexer matches that prefix as the keyword instead of recognizing the
whole thing as an identifier. For example, one of my keywords is "new".
If the input contains "newX", it gets tokenized as the "new" keyword
followed by an identifier "X", whereas I want a single identifier
"newX". That is, I want the identifier rule to be greedy, and to check
the literals table only after it has consumed as much input as it can.
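
To make the behavior I'm after concrete, here is a minimal hand-written
sketch in Java. It is not ANTLR-generated code and not a fix for the
grammar; the class name GreedyIdentSketch, the tokenize method, and the
LITERALS map are made up for illustration. It consumes the longest
possible identifier first and only then looks the full text up in a
keyword table:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GreedyIdentSketch {
    // Hypothetical stand-in for the lexer's literals table:
    // keyword text -> token type name.
    private static final Map<String, String> LITERALS = new HashMap<>();
    static {
        LITERALS.put("new", "NEW_EXPRESSION");
    }

    // Same character classes as the IDENT rule: letter or underscore first.
    private static boolean isIdentStart(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_';
    }

    // Subsequent characters may also be digits.
    private static boolean isIdentPart(char c) {
        return isIdentStart(c) || (c >= '0' && c <= '9');
    }

    public static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            if (isIdentStart(input.charAt(i))) {
                int start = i;
                i++;
                // Greedy step: consume every identifier character available.
                while (i < input.length() && isIdentPart(input.charAt(i))) {
                    i++;
                }
                String text = input.substring(start, i);
                // Only now check the complete text against the keyword table.
                String keyword = LITERALS.get(text);
                tokens.add(keyword != null ? keyword : "IDENT(" + text + ")");
            } else {
                i++; // this sketch simply skips everything else
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [NEW_EXPRESSION, IDENT(newX)]
        System.out.println(tokenize("new newX"));
    }
}
```

With this ordering "new" on its own is still reported as the keyword,
while "newX" comes out as one identifier, which is what I'd like the
generated lexer to do.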

Here's the part of the lexer that I think is relevant:

class MyLexer extends Lexer;

options
{
    importVocab=Hlsr;
    charVocabulary = '\0'..'\377';
    testLiterals=false;    // don't automatically test for literals
    k=11;
}

NEW_EXPRESSION: "new";

IDENT
options
{
    testLiterals=true;
}
:
    ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'_'|'0'..'9')*
;

How can I make the lexer match the whole identifier first and only then
check it against the literals table?
Thanks,
Bob

