[antlr-interest] recursive rule invocations

Thu Apr 9 02:15:46 PDT 2009

Aaron Leiby wrote:

      Everyone's favorite error!

      I'm playing with a subset of the javascript language, but am having
      difficulties building a grammar in ANTLR for it.  I've pared down one
      of my
      core problems to the following subset:

      expression
       : Name (invocation)* // this is the simplest definition that
      exhibits the
      problem
       ;
      // an expressionStatement is an assignment or invocation - used in
      for
      loops, etc.
      expressionStatement
       : (assignmentExpression)=> assignmentExpression
       | Name ( refinement* invocation )+
       ;
      // this is giving me the problem
      assignmentExpression
       : leftHandSideExpression ( '+=' | '-=' | '=' )
        ( leftHandSideExpression '=' )* expression
       ;
      // l-vals must not end with an invocation
      leftHandSideExpression
       : Name ( invocation* refinement )*
       ;
      refinement
       : '.' Name
       ;
      invocation // normally an optional expression list
       : '(' expression ')'
       ;
       // remaining rule references are stubs similar to Name here
      Name
       : 'name'
       ;

      The error I get is:
      [fatal] rule assignmentExpression has non-LL(*) decision due to
      recursive
      rule invocations reachable from alts 1,2.  Resolve by left-factoring
      or
      using syntactic predicates or using backtrack=true option.

      There's only one alternative (which I fail to see how you can only
      have one
      if it's an alternative - why does it say "alts 1,2" in the error?)
      and it
      shows up as a little red bar before the 'expression' rule, between
      repeat
      and skip arrows.  This does not help at all.  Can someone explain how
      to
      interpret this?  All the examples of left factoring I've seen tend to
      be quite simple and obvious.  I'm finding it much more difficult to
      apply
      here.

      I'm sure I'm going about this the wrong way (starting with the
      language spec
      and working backward instead of starting with the top rule and
      decorating
      the grammar one feature at a time), but I still figured it'd be worth
      understanding how to deal with these situations when they inevitably
      pop up.

      I started with Crockford's fancy railroad diagrams, which have
      implied
      precedence.  I feel like that's been lost in the translation to ANTLR
      (or
      maybe just lost sight of).

I'm pretty new to ANTLR, so take this with a lump of salt, but here is what
I think is going on.

Suppose, while looking for an assignmentExpression, your parser came across
the following input:
   "name = name ( ... ".
It can't tell, given its LL(*) look-ahead capabilities, whether the second
name is the beginning of a leftHandSideExpression or of an expression
(because an arbitrary amount of stuff might precede an eventual "=').

I think something like the following might work:

      assignmentExpression
            :     leftHandSideExpression ( '+=' | '-=' | '=' ) lhsOrExp
            ;

      lhsOrExp
            :     (leftHandSideExpression '=')? leftHandSideExpression '='
      lhsOrExp
            |     Expression
            ;

There is probably a "best practice" for dealing with this kind of problem,
but I don't know what it is.

Good luck!

      Bowen
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.antlr.org/pipermail/antlr-interest/attachments/20090409/b714a57b/attachment.html