[antlr-interest] Re: ANTLR Masquerading as SED
djcordhose
oliver at zeigermann.de
Mon Apr 28 15:21:44 PDT 2003
--- In antlr-interest at yahoogroups.com, Terence Parr <parrt at j...>
wrote:
>
> On Monday, April 28, 2003, at 02:56 AM, djcordhose wrote:
>
> > Hi all,
> >
> > I may have missed something, but it occurs to me the example
> > provided in the ANTLR docs is broken:
>
> I'm pretty sure it's ok. Note that when it fails to find something
> that matches, it REWINDS the input and jumps to the filter rule,
> IGNORE. :)
>
> Should work for anything k>=2 :)
>
> Ter
Well, actually you are right. My example was not well chosen, sorry
about that. But consider this grammar:
class T extends Lexer;
options {
    k=3;
    charVocabulary = '\3'..'\177';
}
P : "<p>" {System.out.print("<P>");};
BR : "<br>" {System.out.print("<BR>");};
IGNORE
    : ( "\r\n" | '\r' | '\n' )
      {newline(); System.out.println("");}
    | c:. {System.out.print(c);}
    ;
This is very similar to the docs example, except that it does not use
filtering. I was wondering why a lookahead of k=3 does not work here on
the input "<b>". Only the first two characters are checked, even though
I asked for three...
Here is the generated code fragment:
if ((LA(1)=='<') && (LA(2)=='p')) {
    mP(true);
    theRetToken=_returnToken;
}
else if ((LA(1)=='<') && (LA(2)=='b')) {
    mBR(true);
    theRetToken=_returnToken;
}
else if (((LA(1) >= '\u0003' && LA(1) <= '\u007f')) && (true)) {
    mIGNORE(true);
    theRetToken=_returnToken;
}
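
To make the effect concrete, here is a minimal stand-alone Java sketch of
that decision. It is not ANTLR's runtime: the class LookaheadDemo, the
three-argument LA helper, and both predict methods are hypothetical names
made up for illustration. predictTwoChars mirrors the generated if/else
chain above; predictThreeChars is only an assumption about what a
three-character check might look like, not code that ANTLR produces.

```java
// Hypothetical simulation of the lexer decision; not part of ANTLR.
class LookaheadDemo {
    // i-th lookahead character (1-based) from pos, or '\uffff' past the end.
    static char LA(String input, int pos, int i) {
        int idx = pos + i - 1;
        return idx < input.length() ? input.charAt(idx) : '\uffff';
    }

    // Mirrors the generated fragment: only LA(1) and LA(2) are tested.
    static String predictTwoChars(String in) {
        if (LA(in, 0, 1) == '<' && LA(in, 0, 2) == 'p') {
            return "P";
        } else if (LA(in, 0, 1) == '<' && LA(in, 0, 2) == 'b') {
            return "BR";
        } else if (LA(in, 0, 1) >= '\u0003' && LA(in, 0, 1) <= '\u007f') {
            return "IGNORE";
        }
        return "NO_VIABLE_ALT";
    }

    // Assumed three-character variant: the third character also takes part.
    static String predictThreeChars(String in) {
        if (LA(in, 0, 1) == '<' && LA(in, 0, 2) == 'p' && LA(in, 0, 3) == '>') {
            return "P";
        } else if (LA(in, 0, 1) == '<' && LA(in, 0, 2) == 'b' && LA(in, 0, 3) == 'r') {
            return "BR";
        } else if (LA(in, 0, 1) >= '\u0003' && LA(in, 0, 1) <= '\u007f') {
            return "IGNORE";
        }
        return "NO_VIABLE_ALT";
    }

    public static void main(String[] args) {
        String in = "<b>";
        // Two-character check commits to BR, although "<br>" cannot match "<b>".
        System.out.println("two chars:   " + predictTwoChars(in));
        System.out.println("BR matches:  " + in.startsWith("<br>"));
        // With the third character included, the input falls through to IGNORE.
        System.out.println("three chars: " + predictThreeChars(in));
    }
}
```

On "<b>" the two-character version selects BR, and the match of "<br>"
then fails on the third character; without filtering there is no rewind
to IGNORE.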
Am I still getting things wrong?
Thanks,
Oliver