[antlr-interest] Re: ANTLR Masquerading as SED

djcordhose oliver at zeigermann.de
Mon Apr 28 15:21:44 PDT 2003


--- In antlr-interest at yahoogroups.com, Terence Parr <parrt at j...> wrote:
> 
> On Monday, April 28, 2003, at 02:56  AM, djcordhose wrote:
> 
> > Hi all,
> >
> > I may have missed something, but it occurs to me the example
> > provided in the ANTLR docs is broken:
> 
> I'm pretty sure it's ok.  Note that when it fails to find something
> that matches, it REWINDS the input and jumps to the filter rule,
> IGNORE. :)
> 
> Should work for anything k>=2 :)
> 
> Ter

Well, actually you are right; my example was not well chosen, sorry 
about that. But consider this grammar:

class T extends Lexer;
options {
  k=3;
  charVocabulary = '\3'..'\177';
}
P  : "<p>" {System.out.print("<P>");};
BR : "<br>" {System.out.print("<BR>");};

IGNORE
  :  ( "\r\n" | '\r' | '\n' )
     {newline(); System.out.println("");}
  |  c:. {System.out.print(c);}
  ;


This is very similar, except that it does not use filtering. I was just 
wondering why a lookahead of k=3 does not work here on the input "<b>". 
Only the first two characters are checked, even though I asked for 
three.
Here is the generated code fragment:

if ((LA(1)=='<') && (LA(2)=='p')) {
    mP(true);
    theRetToken=_returnToken;
}
else if ((LA(1)=='<') && (LA(2)=='b')) {
    mBR(true);
    theRetToken=_returnToken;
}
else if (((LA(1) >= '\u0003' && LA(1) <= '\u007f')) && (true)) {
    mIGNORE(true);
    theRetToken=_returnToken;
}
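
To make the concern concrete, here is a small stand-alone Java sketch 
that mimics the prediction chain above. It is not ANTLR's runtime: the 
class PredictionSketch and the helpers la, predict and ruleMatches are 
made up for illustration, and ruleMatches only stands in for what I 
assume mP/mBR/mIGNORE would go on to match.

```java
public class PredictionSketch {
    // Hypothetical stand-in for LA(i): 1-based lookahead from pos,
    // returning '\uffff' when the index runs past the end of the input.
    static char la(String in, int pos, int i) {
        int idx = pos + i - 1;
        return idx < in.length() ? in.charAt(idx) : '\uffff';
    }

    // Mirrors the generated if/else chain: at most two characters
    // are inspected before a rule is chosen.
    static String predict(String in, int pos) {
        if (la(in, pos, 1) == '<' && la(in, pos, 2) == 'p') return "P";
        if (la(in, pos, 1) == '<' && la(in, pos, 2) == 'b') return "BR";
        char c = la(in, pos, 1);
        if (c >= '\u0003' && c <= '\u007f') return "IGNORE";
        return "NONE";
    }

    // Assumption: checks whether the text the chosen rule has to match
    // is really present at pos (the literals come from the grammar).
    static boolean ruleMatches(String rule, String in, int pos) {
        switch (rule) {
            case "P":      return in.startsWith("<p>", pos);
            case "BR":     return in.startsWith("<br>", pos);
            case "IGNORE": return pos < in.length();
            default:       return false;
        }
    }

    public static void main(String[] args) {
        for (String in : new String[] {"<p>", "<br>", "<b>"}) {
            String rule = predict(in, 0);
            System.out.println(in + " -> " + rule
                + ", rule matches: " + ruleMatches(rule, in, 0));
        }
    }
}
```

For "<b>" the sketch picks BR from the first two characters, although 
the literal "<br>" is not there; a test on the third character would 
have let the input fall through to IGNORE instead.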

Am I still getting things wrong?

Thanks,

Oliver




 
