[antlr-interest] how to set lookahead in v3
Markus Kuhla
bace.spam at gmx.net
Sun May 6 20:54:55 PDT 2007
Hi Johannes,
special thanks to you!
It seemed to work until I tried to handle the leading blanks. Nevertheless, your idea of putting the NONTERMINALS in this rule helped me a lot!
Have a nice evening!
> Markus Kuhla wrote:
> > Hi Johannes, Hi all,
> >
> > What I expected from LL(*) was that it can also decide whether to leave
> > the current rule, even if the next (two) tokens match an
> > alternative of the current rule.
> >
> > I'll give you more details, because your proposal does not work. The right
> > alternative is much higher in the tree.
> >
> > side : (section)+ EOF;
> > section : blanks? (separator | textsection) NEWLINE;
> > separator : DASH DASH blanks? NEWLINE;
> > textsection : (textline_part)+;
> > textline_part : '/*' commentline+ ('*/')?;
> > commentline : NEWLINE blanks? any_char_not_dash;
> >
> > input = 'text /* COMMENT\n --\n NOTCOMMENT'
> >
> > So the parser has reached the point where it must decide whether to continue with a
> > second commentline (which could fit if it considers NEWLINE blanks? only), but
> > it should recognize the dashes. Then it should end the commentline+ loop,
> > go back to section and decide that a separator comes next!
> >
> > Do you know what I mean? I hope you can give me a good hint.
> >
> > Thank you all for your great work here!
> > Markus
>
> I've created the following grammar from your snippet:
>
> side : section+ EOF;
> section : BLANKS? (separator | textsection) NEWLINE;
> separator : DASH DASH BLANKS? NEWLINE;
> textsection : textline_part+;
> textline_part : '/*' commentline+ '*/'?;
> commentline : NEWLINE BLANKS? ~(DASH | NEWLINE | BLANKS);
>
> BLANKS: (' ' | '\t')+ ;
> NEWLINE: ('\r' '\n'?| '\n');
> DASH: '-';
>
> Note that I changed any_char_not_dash to exclude NEWLINE and BLANKS as
> well, to remove an ambiguity. This shouldn't affect the recognition
> capabilities. Nonetheless there is still one ambiguity remaining:
> "NEWLINE BLANKS? /* */" can be matched by commentline or by two
> following section tokens. The problem is that the comment of
> textline_part has an optional '*/'. Removing the ? clears things up,
> but changes the recognized language. The reason for this behaviour may be
> that you haven't given us the entire grammar file. As I know that you can't
> do that, my advice is to look at the C# grammar specification in the
> ECMA-334 standard to see how they implemented multiline comments.
>
> Best regards,
> Johannes Luber
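
For illustration only, here is a minimal Python sketch of the check that the commentline+ loop in the quoted grammar would have to make before each iteration: look past NEWLINE and an optional BLANKS, and continue only if the next token is not DASH, NEWLINE or BLANKS. The token names OPEN, CLOSE and OTHER and the functions tokenize and continues_comment are made up for this sketch and appear in neither grammar. It is not how ANTLR's LL(*) analysis is implemented, just a hand-written version of the decision.

```python
import re

# BLANKS, NEWLINE and DASH follow the quoted grammar.
# OPEN, CLOSE and OTHER are hypothetical names used only in this sketch.
TOKEN_SPEC = [
    ("BLANKS", r"[ \t]+"),
    ("NEWLINE", r"\r\n?|\n"),
    ("OPEN", r"/\*"),
    ("CLOSE", r"\*/"),
    ("DASH", r"-"),
    ("OTHER", r"."),  # any other single character
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(text):
    """Return the list of token type names for text."""
    return [m.lastgroup for m in TOKEN_RE.finditer(text)]


def continues_comment(tokens, i):
    """True if tokens[i:] start with NEWLINE BLANKS? followed by a token
    that is not DASH, NEWLINE or BLANKS (one more commentline)."""
    if i >= len(tokens) or tokens[i] != "NEWLINE":
        return False
    i += 1
    if i < len(tokens) and tokens[i] == "BLANKS":
        i += 1  # the blanks are optional
    return i < len(tokens) and tokens[i] not in ("DASH", "NEWLINE", "BLANKS")


tokens = tokenize("text /* COMMENT\n --\n NOTCOMMENT")
first_newline = tokens.index("NEWLINE")
second_newline = tokens.index("NEWLINE", first_newline + 1)

# Before the dashes the loop should stop; after them another line could follow.
print(continues_comment(tokens, first_newline))   # False
print(continues_comment(tokens, second_newline))  # True
```

On the example input the check fails at the first NEWLINE because a DASH follows the blanks, so the loop would end there and the dashes would be left for separator.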