[antlr-interest] Odd lexer rule resolution problem

shmuel siegel antlr at shmuelhome.mine.nu
Mon Sep 25 13:42:46 PDT 2006


I don't remember this problem being discussed recently, but I have
encountered an unusual (for me) problem in the ANTLR 2.7.6 lexer generator.

Consider this simple grammar:
header
{
    package error;
}

class boo extends Lexer;
options
{
    k=2;
    charVocabulary = '\3'..'\377'; // LATIN
}

NEWLINE :
        '\n'
    |   '\r'
    |   "@@n" // ARB-specific newline equivalent, not in source
    ;
AT_SIGN : "@";


The "@" alternative in NEWLINE blocks AT_SIGN. Reversing the two rules
reverses the problem: whichever rule comes first wins, because ANTLR
generates the following test in nextToken():
                if ((LA(1)=='\n'||LA(1)=='\r'||LA(1)=='@') && (true)) {
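To make the effect of that test concrete, here is a minimal Java sketch of a first-match dispatch. It is not the code ANTLR generates; DispatchSketch, newlinePredicts, atSignPredicts and dispatch are hypothetical names, and the AT_SIGN test is an assumption, since only the NEWLINE test is quoted above.

```java
public class DispatchSketch {
    // Mirrors the quoted test: depth 1 is checked, depth 2 is the constant "true".
    static boolean newlinePredicts(char la1) {
        return (la1 == '\n' || la1 == '\r' || la1 == '@') && (true);
    }

    // ASSUMPTION: the AT_SIGN test is not quoted in the post; LA(1)=='@' is a guess.
    static boolean atSignPredicts(char la1) {
        return la1 == '@';
    }

    // Tries the rule tests in grammar order; the first test that passes wins.
    static String dispatch(String input, boolean newlineFirst) {
        char la1 = input.charAt(0);
        if (newlineFirst) {
            if (newlinePredicts(la1)) return "NEWLINE";
            if (atSignPredicts(la1)) return "AT_SIGN";
        } else {
            if (atSignPredicts(la1)) return "AT_SIGN";
            if (newlinePredicts(la1)) return "NEWLINE";
        }
        return "no rule";
    }

    public static void main(String[] args) {
        System.out.println(dispatch("@x", true));   // a lone "@" is sent to NEWLINE
        System.out.println(dispatch("@@n", false)); // with the rules reversed, "@@n" is sent to AT_SIGN
    }
}
```

Because the second half of the condition is always true, the second character never influences the choice, so rule order alone decides which rule gets any input starting with "@".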

Interestingly, ANTLR gets it right if NEWLINE matches only "@@n". Is
this just another manifestation of the linear approximate lookahead
problem?
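One way to see why the single-alternative case behaves differently is a toy Java model of per-depth lookahead sets. This is only a sketch of the idea, not ANTLR's actual analysis: LinearLookaheadSketch, depthSets and accepts are hypothetical names, and treating "past the end of a short alternative" as "any character" is my assumption about how the character following a token enters the set.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class LinearLookaheadSketch {
    static final String ANY = "<any>";

    // Builds one set per lookahead depth by merging all alternatives at that depth.
    // ASSUMPTION: an alternative shorter than the depth contributes "any character".
    static List<Set<String>> depthSets(int k, String... alts) {
        List<Set<String>> sets = new ArrayList<>();
        for (int d = 0; d < k; d++) {
            Set<String> s = new TreeSet<>();
            for (String alt : alts) {
                s.add(d < alt.length() ? String.valueOf(alt.charAt(d)) : ANY);
            }
            sets.add(s);
        }
        return sets;
    }

    // The input is predicted if each character is in the set for its depth.
    static boolean accepts(List<Set<String>> sets, String input) {
        for (int d = 0; d < sets.size(); d++) {
            Set<String> s = sets.get(d);
            if (s.contains(ANY)) continue; // this depth degenerates to "(true)"
            if (d >= input.length()) return false;
            if (!s.contains(String.valueOf(input.charAt(d)))) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        List<Set<String>> mixed = depthSets(2, "\n", "\r", "@@n");
        List<Set<String>> only = depthSets(2, "@@n");
        System.out.println(accepts(mixed, "@x")); // depth 2 is "any", so a lone "@" is claimed
        System.out.println(accepts(only, "@x"));  // depth 2 is {'@'}, so a lone "@" is rejected
    }
}
```

In this model the one-character alternatives '\n' and '\r' widen the depth-2 set to "anything", which would produce the `(true)` seen in the generated test; with "@@n" alone, depth 2 stays at '@' and AT_SIGN is no longer blocked.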

The good news is that ANTLR 3 gets it right.


