[antlr-interest] how to set lookahead in v3
Markus Kuhla
bace.spam at gmx.net
Thu May 3 08:55:28 PDT 2007
Hi Johannes, Hi all,
> > Hi.
> >
> > I'm using antlr, v3beta7. The LL(*) algorithm determines automatically
> how much lookahead is necessary to choose an alternative?!
> >
> > I have the problem, that the parser can find 1 or 2 tokens matching to
> an alternative of a parser rule, e.g.
> > next input: "NEWLINE BLANK BLANK DASH DASH ..."
> > the parser is currently in this rule:
> >
> > text : line+;
> > line : NEWLINE? blanks? all_chars_but_not_dash
> > | EOF;
> >
> > The parser chooses the 1st alternative of line, and gives an error of
> course.
> > In the grammar is another possibility, to continue with "NEWLINE blanks?
> DASH" - So the parser has to go out of the line-rule, if there is a dash!
> >
> >
> > How can you realize a behavior like a real LL(*) parser, that there is
> enough lookahead to make an appropriate decision (leave rule line or continue
> with alternative1)
>
> The problem isn't that LL(*) is insufficient, but that your grammar is
> ambiguous. The tokens "NEWLINE BLANK" can be either matched by one line
> token or by two line tokens. I think that the following rules don't
> exhibit this behaviour:
>
> text : (NEWLINE? blanks? line)+ EOF;
> line : all_chars_but_not_dash;
>
> Best regards,
> Johannes Luber
What I expected from LL(*) was that it can also decide to leave the current rule, even if the next one or two tokens match an alternative of the current rule.
Here are more details, because your proposal does not work for my grammar: the correct alternative sits much higher in the rule tree.
side : (section)+ EOF;
section : blanks? (separator | textsection) NEWLINE;
separator : DASH DASH blanks? NEWLINE;
textsection : (textline_part)+;
textline_part : '/*' commentline+ ('*/')?;
commentline : NEWLINE blanks? any_char_not_dash;
input = 'text /* COMMENT\n --\n NOTCOMMENT'
So the parser has reached the point where it must decide whether to continue with a second commentline (which would fit if it considers only NEWLINE blanks?), but it should notice the dashes. It should then end the commentline+ loop, return to section, and decide that a separator comes next!
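To make the decision I have in mind concrete, here is a minimal hand-written sketch in Python. It is not ANTLR output and not how ANTLR implements LL(*); it only illustrates the loop-exit test I would expect. The token names follow the grammar above, while the CHAR token kind and both helper functions are hypothetical, and I assume any_char_not_dash matches a run of CHAR tokens.

```python
# Token kinds as plain strings. NEWLINE, BLANK and DASH come from the grammar;
# CHAR is a hypothetical stand-in for "any character token that is not a dash".
NEWLINE, BLANK, DASH, CHAR = "NEWLINE", "BLANK", "DASH", "CHAR"


def starts_commentline(tokens, pos):
    """Look ahead without consuming: does NEWLINE blanks? any_char_not_dash start at pos?"""
    if pos >= len(tokens) or tokens[pos] != NEWLINE:
        return False
    pos += 1
    # Skip an arbitrary number of blanks; this is the unbounded part of the lookahead.
    while pos < len(tokens) and tokens[pos] == BLANK:
        pos += 1
    # Only now can the decision be made: a DASH here means "leave the loop".
    return pos < len(tokens) and tokens[pos] == CHAR


def match_commentlines(tokens, pos=0):
    """Consume as many commentlines as the lookahead allows.

    Returns (position after the last commentline, number of commentlines matched).
    """
    count = 0
    while starts_commentline(tokens, pos):
        pos += 1  # NEWLINE
        while tokens[pos] == BLANK:  # safe: the lookahead saw a CHAR after the blanks
            pos += 1
        while pos < len(tokens) and tokens[pos] == CHAR:
            pos += 1
        count += 1
    return pos, count


# One commentline, followed by "NEWLINE BLANK BLANK DASH DASH NEWLINE" (a separator).
tokens = [NEWLINE, BLANK, CHAR, NEWLINE, BLANK, BLANK, DASH, DASH, NEWLINE]
end, matched = match_commentlines(tokens)
```

With this token stream the loop stops in front of the second NEWLINE, so the enclosing rule is free to match the separator from there.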
Do you know what I mean? I hope you can give me a good hint.
Thank you all for your great work here!
Markus