[antlr-interest] Odd lexer rule resolution problem
shmuel siegel
antlr at shmuelhome.mine.nu
Mon Sep 25 13:42:46 PDT 2006
I don't remember this problem being discussed recently, but I have
encountered an unusual (for me) problem in the ANTLR 2.7.6 lexer generator.
Consider this simple grammar:
header
{
    package error;
}
class boo extends Lexer;
options
{
    k=2;
    charVocabulary = '\3'..'\377'; // LATIN
}
NEWLINE :
      '\n'
    | '\r'
    | "@@n" //ARB specific new line equivalent not in source
    ;
AT_SIGN : "@";
The "@" in NEWLINE blocks AT_SIGN. Reversing the two rules reverses the
problem. That is, whichever rule comes first wins, because ANTLR
generates the following test in nextToken():
if ((LA(1)=='\n'||LA(1)=='\r'||LA(1)=='@') && (true)) {
Interestingly enough, ANTLR gets it right if NEWLINE is only looking
for "@@n". Is this just another manifestation of the linear lookahead
problem?
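One way to see why the single-alternative version behaves differently is to model linear approximate lookahead, where the alternatives of a rule are merged into one character set per lookahead depth instead of being kept as separate sequences. The Java sketch below is only an illustration under that assumption. It is not ANTLR's actual analysis code, and the names LinearLookaheadSketch, DepthSet, linearLookahead and predicts are made up for this example. Treating the end of an alternative as "any character may follow" is also a simplification.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class LinearLookaheadSketch {

    /** Characters allowed at one lookahead depth (hypothetical helper type). */
    static final class DepthSet {
        final Set<Character> chars = new HashSet<>();
        // Set when some alternative has already ended at this depth,
        // so any following character is acceptable (a simplification).
        boolean matchesAny = false;

        boolean accepts(char c) {
            return matchesAny || chars.contains(c);
        }
    }

    /** Merge all alternatives of a rule into one set per depth, up to depth k. */
    static List<DepthSet> linearLookahead(String[] alternatives, int k) {
        List<DepthSet> depths = new ArrayList<>();
        for (int d = 0; d < k; d++) {
            DepthSet set = new DepthSet();
            for (String alt : alternatives) {
                if (d < alt.length()) {
                    set.chars.add(alt.charAt(d));
                } else {
                    set.matchesAny = true;
                }
            }
            depths.add(set);
        }
        return depths;
    }

    /** True if the per-depth test would select the rule for this input. */
    static boolean predicts(List<DepthSet> depths, String input) {
        for (int d = 0; d < depths.size(); d++) {
            if (d >= input.length()) {
                // End of input: only a depth that accepts anything lets it through.
                if (!depths.get(d).matchesAny) {
                    return false;
                }
            } else if (!depths.get(d).accepts(input.charAt(d))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<DepthSet> threeAlts = linearLookahead(new String[] {"\n", "\r", "@@n"}, 2);
        List<DepthSet> oneAlt = linearLookahead(new String[] {"@@n"}, 2);

        System.out.println(predicts(threeAlts, "@x")); // true: NEWLINE claims a lone '@'
        System.out.println(predicts(oneAlt, "@x"));    // false: AT_SIGN gets its chance
        System.out.println(predicts(oneAlt, "@@n"));   // true
    }
}
```

In this model, with all three alternatives the depth-2 set accepts anything because '\n' and '\r' end after one character, so the two-character test degenerates to the LA(1) check followed by (true), as in the generated code. With only "@@n" the depth-2 set is just '@', so a lone "@" is no longer claimed by NEWLINE.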
The good news is that antlr3 gets it right.