[antlr-interest] [C Target][3.1.1] Trying to understand the behavior of rules with kleene stars

Sven Van Echelpoel sven.van.echelpoel at empolis.com
Mon May 11 07:30:29 PDT 2009


Hi,

I'm having trouble understanding the behavior of the parser w.r.t.
invalid tokens in rules with Kleene star elements. I have this grammar
that says that a translation unit is zero or more rules, declarations,
etc., e.g.

translation_unit
  : ( declaration | rule )* ';'
    -> ^( UNIT rule* )     // only care about rules
  ;

Now, if a rule is followed, after the semicolon, by a token that is
illegal at that position, no more rules are processed. No error is
generated. Looking at the generated code, you get something like this:

for (;;)
{
  int alt2=2;
  {
    int LA2_0 = LA(1);
    if ( LA2_0 == /*some tokens expected at this position*/  )  // (1)
    {
      alt2=1;
    }


  }
  switch (alt2) 
  {
  case 1:
    /* Continue here if this was what was expected */
    break;
  default:
    goto loop2;	/* break out of the loop */                    //(2)
    break;
  }
}
loop2: ; /* Jump out to here if this rule does not match */    //(3)

In (1) the lookahead token is checked against a set of expected tokens.
There can be multiple else-if branches following this too. If the token
is unexpected, the value of alt2 remains 2 and in the subsequent switch
the default case (2) is taken. This simply breaks out of the loop. After
the loop2 label processing continues as if nothing has happened (3). In
our example above, AST rewrite rules are invoked.
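
To make the effect concrete, here is a minimal standalone sketch of the
same loop shape. It is not ANTLR output: the token names, TokenStream,
match_star and has_leftover_input are all invented for illustration, and
the explicit end-of-input check is only one assumed way a caller could
notice the problem.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token types, for illustration only. */
typedef enum { TOK_RULE, TOK_DECL, TOK_BOGUS, TOK_EOF } TokenType;

typedef struct {
    const TokenType *tokens;  /* token buffer, terminated by TOK_EOF */
    size_t pos;               /* index of the current lookahead token */
    int errors;               /* error counter; the loop never touches it */
} TokenStream;

/* Equivalent of LA(1): peek at the next token without consuming it. */
static TokenType la1(const TokenStream *ts)
{
    return ts->tokens[ts->pos];
}

/* Mimics the ( declaration | rule )* loop: consume tokens while the
 * lookahead is in the expected set, otherwise leave the loop silently.
 * Returns the number of iterations that matched. */
static size_t match_star(TokenStream *ts)
{
    size_t matched = 0;
    assert(ts != NULL);
    for (;;) {
        int alt = 2;
        TokenType t = la1(ts);
        if (t == TOK_RULE || t == TOK_DECL) {   /* (1) expected set */
            alt = 1;
        }
        switch (alt) {
        case 1:
            ts->pos++;                          /* "match" one element */
            matched++;
            break;
        default:
            goto loop_exit;                     /* (2) no error raised */
        }
    }
loop_exit:                                      /* (3) carry on as usual */
    return matched;
}

/* Assumed caller-side check: anything other than end of input after
 * the loop means tokens were left unconsumed. */
static int has_leftover_input(const TokenStream *ts)
{
    return la1(ts) != TOK_EOF;
}
```

For the made-up input RULE RULE BOGUS RULE, match_star consumes the two
leading rules and returns with BOGUS still in the lookahead and the
error counter still at zero; only the separate leftover check reveals
that the trailing rule was never seen.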

Note that this pattern is consistently applied every time a Kleene star
is used somewhere in a rule. If a token is unexpected at that position,
processing just stops and no error is raised. It seems to me that the
code is a bit too liberal in interpreting the zero of zero-or-more :-) ,
i.e. even zero times something expected is fine, erroneously discounting
the stuff that is unexpected. Am I right, or am I missing something?

Apologies if this is a real issue and it has already been fixed after
3.1.1. I found nothing in the bug db and currently have no time to
investigate this in a later release.

Sven




