[antlr-interest] Lookahead and wildcards (was: ANTLR Masquerading as SED)

Oliver Zeigermann oliver at zeigermann.de
Tue Apr 29 00:02:46 PDT 2003


OK, maybe I am dumb, but I still do not get it. This is the real
core of my question:

Why does this grammar


class T extends Lexer;
options {
  k=3;
  charVocabulary = '\3'..'\177';
}
P  : "<p>" ;
BR : "<br>" ;

IGNORE : . ;


result in this generated code


if ((LA(1)=='<') && (LA(2)=='p')) {
    mP(true);
    theRetToken=_returnToken;
}
else if ((LA(1)=='<') && (LA(2)=='b')) {
    mBR(true);
    theRetToken=_returnToken;
}
else if (((LA(1) >= '\u0003' && LA(1) <= '\u007f')) && (true)) {
    mIGNORE(true);
    theRetToken=_returnToken;
}


I would have expected this code instead (because of k=3):


if ((LA(1)=='<') && (LA(2)=='p') && (LA(3)=='>')) {
    mP(true);
    theRetToken=_returnToken;
}
else if ((LA(1)=='<') && (LA(2)=='b') && (LA(3)=='r')) {
    mBR(true);
    theRetToken=_returnToken;
}
else if (((LA(1) >= '\u0003' && LA(1) <= '\u007f')) && (true)) {
    mIGNORE(true);
    theRetToken=_returnToken;
}
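
To make the practical difference between the two versions concrete, here is
a minimal stand-alone sketch in plain Java. It does not use ANTLR at all: the
class LookaheadSketch and the helpers la, generated and expected are
hypothetical names made up for this illustration, and each method simply
returns the name of the rule the corresponding if/else chain would pick.

```java
public class LookaheadSketch {
    // Hypothetical stand-in for LA(i): 1-based lookahead, '\0' past the end of input.
    static char la(String in, int i) {
        return i <= in.length() ? in.charAt(i - 1) : '\0';
    }

    // Mirrors the decision chain that was actually generated (two characters tested).
    static String generated(String in) {
        if (la(in, 1) == '<' && la(in, 2) == 'p') {
            return "P";
        } else if (la(in, 1) == '<' && la(in, 2) == 'b') {
            return "BR";
        } else if (la(in, 1) >= '\u0003' && la(in, 1) <= '\u007f') {
            return "IGNORE";
        }
        return "none";
    }

    // Mirrors the decision chain expected with k=3 (three characters tested).
    static String expected(String in) {
        if (la(in, 1) == '<' && la(in, 2) == 'p' && la(in, 3) == '>') {
            return "P";
        } else if (la(in, 1) == '<' && la(in, 2) == 'b' && la(in, 3) == 'r') {
            return "BR";
        } else if (la(in, 1) >= '\u0003' && la(in, 1) <= '\u007f') {
            return "IGNORE";
        }
        return "none";
    }

    public static void main(String[] args) {
        // Compare which rule each chain selects for a few sample inputs.
        for (String in : new String[] {"<p>", "<br>", "<pre>", "<b>"}) {
            System.out.println(in + " -> generated: " + generated(in)
                    + ", expected: " + expected(in));
        }
    }
}
```

For "<p>" and "<br>" both chains agree. For an input such as "<pre>" the
generated chain already commits to P after two characters (so mP would then
presumably fail on the 'r'), while the chain I expected would fall through
to IGNORE.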


Thanks,

Oliver



 
