[antlr-interest] how to set lookahead in v3

Markus Kuhla bace.spam at gmx.net
Sun May 6 20:54:55 PDT 2007


Hi Johannes,

Special thanks to you!
It seemed to work until I tried to include the leading blanks. Nevertheless, your idea to put the NONTERMINALS in this rule helped me a lot!

Happy evening!

> Markus Kuhla wrote:
> > Hi Johannes, Hi all,
> > 
> > What I expected from LL(*) was that it can also decide whether to leave
> > the current rule, even if the next (two) tokens match an alternative of
> > the current rule.
> > 
> > I'll give you more details, because your proposal does not work. The right
> > alternative is much higher in the tree.
> > 
> > side           : (section)+  EOF;
> > section        : blanks?  (separator | textsection) NEWLINE;
> > separator      : DASH  DASH  blanks?  NEWLINE;
> > textsection    : (textline_part)+;
> > textline_part  : '/*'  commentline+  ('*/')?;
> > commentline    : NEWLINE  blanks?  any_char_not_dash;
> > 
> > input = 'text /* COMMENT\n  --\n NOTCOMMENT'
> > 
> > So the parser has reached the point where it must decide whether to
> > continue with a second commentline (which could fit if it considers only
> > NEWLINE blanks?), but it should recognize the dashes. Then it should end
> > the commentline ()+ loop, go back to section, and decide that a separator
> > comes next!
> > 
> > Do you know what I mean? I hope you can give me a good hint.
> > 
> > Thank you all for your great work here!
> > Markus
> 
> I've created the following grammar from your snippet:
> 
> side           : section+  EOF;
> section        : BLANKS?  (separator | textsection) NEWLINE;
> separator      : DASH  DASH  BLANKS?  NEWLINE;
> textsection    : textline_part+;
> textline_part  : '/*' commentline+ '*/'?;
> commentline    : NEWLINE  BLANKS?  ~(DASH | NEWLINE | BLANKS);
> 
> BLANKS: (' ' | '\t')+ ;
> NEWLINE: ('\r' '\n'?| '\n');
> DASH: '-';
> 
> Note that I changed any_char_not_dash to exclude NEWLINE and BLANKS as
> well, to remove an ambiguity. This shouldn't affect the recognition
> capabilities. Nonetheless, one ambiguity remains:
> "NEWLINE BLANKS? /* */" can be matched by commentline or by two
> consecutive sections. The problem is that the comment in
> textline_part has an optional '*/'. Removing the ? clears things up,
> but changes the recognized language. The reason for this behaviour may be
> that you haven't given us the entire grammar file. Since I know that you
> can't do that, my advice is to look at how the C# grammar specification in
> the ECMA-334 standard implements multiline comments.
> 
> Best regards,
> Johannes Luber
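
The lookahead decision discussed above (at a NEWLINE inside the commentline loop, does another commentline start, or do dashes follow so that the loop should end?) can be sketched outside ANTLR. The Python below is only an illustration, not the grammar from this thread and not ANTLR-generated code. The token names OPEN, CLOSE and CHAR and the helpers tokenize and starts_commentline are hypothetical; NEWLINE, BLANKS and DASH mirror the lexer rules quoted above.

```python
import re

# NEWLINE, BLANKS and DASH follow the lexer rules quoted above.
# OPEN, CLOSE and CHAR are hypothetical names for '/*', '*/' and
# "any other single character".
TOKEN_RE = re.compile(
    r"(?P<OPEN>/\*)|(?P<CLOSE>\*/)|(?P<NEWLINE>\r\n?|\n)"
    r"|(?P<BLANKS>[ \t]+)|(?P<DASH>-)|(?P<CHAR>.)"
)


def tokenize(text):
    """Return the list of token type names for text, in order."""
    return [m.lastgroup for m in TOKEN_RE.finditer(text)]


def starts_commentline(tokens, pos):
    """Pure lookahead, consumes nothing.

    True if NEWLINE BLANKS? followed by a token that is not DASH,
    NEWLINE or BLANKS starts at index pos.
    """
    if pos >= len(tokens) or tokens[pos] != "NEWLINE":
        return False
    pos += 1
    # BLANKS is optional, so skip it if present.
    if pos < len(tokens) and tokens[pos] == "BLANKS":
        pos += 1
    return pos < len(tokens) and tokens[pos] not in ("DASH", "NEWLINE", "BLANKS")


if __name__ == "__main__":
    tokens = tokenize("text /* COMMENT\n  --\n NOTCOMMENT")
    first_nl = tokens.index("NEWLINE")
    second_nl = tokens.index("NEWLINE", first_nl + 1)
    # The first NEWLINE is followed by BLANKS DASH, the second by BLANKS CHAR.
    print(starts_commentline(tokens, first_nl))
    print(starts_commentline(tokens, second_nl))
```

For the example input, the check fails at the first NEWLINE because blanks and a dash follow, so a loop guarded by it would stop there and leave the dashes to an enclosing rule. It succeeds at the second NEWLINE. Looking one token past the optional blanks is enough here, because that token alone separates the two cases.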
