[antlr-interest] how to set lookahead in v3

Markus Kuhla bace.spam at gmx.net
Thu May 3 08:55:28 PDT 2007


Hi Johannes, Hi all,

> > Hi.
> > 
> > I'm using ANTLR v3beta7. Does the LL(*) algorithm automatically determine
> > how much lookahead is necessary to choose an alternative?
> > 
> > My problem is that the parser can find 1 or 2 tokens that match
> > an alternative of a parser rule, e.g.
> > next input: "NEWLINE BLANK BLANK DASH DASH ..."
> > the parser is currently in this rule:
> > 
> > text : line+;
> > line : NEWLINE? blanks? all_chars_but_not_dash
> >      | EOF;
> > 
> > The parser chooses the 1st alternative of line and, of course, reports
> > an error.
> > The grammar offers another possibility: to continue with "NEWLINE blanks?
> > DASH". So the parser has to leave the line rule if there is a dash!
> > 
> > 
> > How can I get the behavior of a real LL(*) parser, with enough
> > lookahead to make an appropriate decision (leave rule line or continue
> > with alternative 1)?
> 
> The problem isn't that LL(*) is insufficient, but that your grammar is
> ambiguous. The tokens "NEWLINE BLANK" can be either matched by one line
> token or by two line tokens. I think that the following rules don't
> exhibit this behaviour:
> 
> text : (NEWLINE? blanks? line)+ EOF;
> line : all_chars_but_not_dash;
> 
> Best regards,
> Johannes Luber


What I expected from LL(*) was that it can also decide to leave the current rule, even if the next one or two tokens match an alternative of the current rule.

Here are more details, because your proposal does not work for my grammar: the correct alternative sits much higher in the rule tree.

side           : (section)+  EOF;
section        : blanks?  (separator | textsection) NEWLINE;
separator      : DASH  DASH  blanks?  NEWLINE;
textsection    : (textline_part)+;
textline_part  : '/*'  commentline+  ('*/')?;
commentline    : NEWLINE  blanks?  any_char_not_dash;

input = 'text /* COMMENT\n  --\n NOTCOMMENT'

So the parser has reached the point where it must decide whether to continue with a second commentline. That alternative fits if it considers only "NEWLINE blanks?", but it should look further and recognize the dashes. It should then end the commentline+ loop, return to section, and decide that a separator comes next!
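
To make the decision I expect more concrete, here is a minimal hand-written sketch in Python. It is not ANTLR output and says nothing about how ANTLR builds its lookahead; the token names (NEWLINE, BLANK, DASH, CHAR) and both function names are made up for illustration, and I assume any_char_not_dash is a single CHAR token. The loop only continues if the token after "NEWLINE blanks?" is not a dash:

```python
# Hypothetical token names; the real lexer of the grammar may differ.
NEWLINE, BLANK, DASH, CHAR = "NEWLINE", "BLANK", "DASH", "CHAR"


def starts_commentline(tokens, pos):
    """Return True if tokens[pos:] begins with NEWLINE blanks? CHAR."""
    if pos >= len(tokens) or tokens[pos] != NEWLINE:
        return False
    pos += 1
    # Skip the optional run of blanks; this is the unbounded part of the lookahead.
    while pos < len(tokens) and tokens[pos] == BLANK:
        pos += 1
    # Only an ordinary character continues the comment; a DASH does not.
    return pos < len(tokens) and tokens[pos] == CHAR


def match_commentlines(tokens, pos=0):
    """Match commentline+ and return (lines matched, position after the loop)."""
    count = 0
    while starts_commentline(tokens, pos):
        pos += 1  # consume NEWLINE
        while tokens[pos] == BLANK:
            pos += 1  # consume blanks (a CHAR is guaranteed to follow)
        pos += 1  # consume the single non-dash character
        count += 1
    return count, pos


# One comment line, then a separator line, then ordinary text.
tokens = [NEWLINE, BLANK, CHAR,
          NEWLINE, BLANK, BLANK, DASH, DASH,
          NEWLINE, BLANK, CHAR]
count, pos = match_commentlines(tokens)
```

With these tokens the loop matches one commentline and stops in front of the second NEWLINE, so the enclosing rule is free to try the separator.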

Do you know what I mean? I hope you can give me a good hint.

Thank you all for your great work here!
Markus

