[antlr-interest] recursive rule invocations
Bowen Alpern
alpernb at us.ibm.com
Thu Apr 9 02:15:46 PDT 2009
Aaron Leiby wrote:
Everyone's favorite error!
I'm playing with a subset of the javascript language, but am having
difficulties building a grammar in ANTLR for it. I've pared down one
of my
core problems to the following subset:
expression
: Name (invocation)* // this is the simplest definition that
exhibits the
problem
;
// an expressionStatement is an assignment or invocation - used in
for
loops, etc.
expressionStatement
: (assignmentExpression)=> assignmentExpression
| Name ( refinement* invocation )+
;
// this is giving me the problem
assignmentExpression
: leftHandSideExpression ( '+=' | '-=' | '=' )
( leftHandSideExpression '=' )* expression
;
// l-vals must not end with an invocation
leftHandSideExpression
: Name ( invocation* refinement )*
;
refinement
: '.' Name
;
invocation // normally an optional expression list
: '(' expression ')'
;
// remaining rule references are stubs similar to Name here
Name
: 'name'
;
The error I get is:
[fatal] rule assignmentExpression has non-LL(*) decision due to
recursive
rule invocations reachable from alts 1,2. Resolve by left-factoring
or
using syntactic predicates or using backtrack=true option.
There's only one alternative (which I fail to see how you can only
have one
if it's an alternative - why does it say "alts 1,2" in the error?)
and it
shows up as a little red bar before the 'expression' rule, between
repeat
and skip arrows. This does not help at all. Can someone explain how
to
interpret this? All the examples of left factoring I've seen tend to
be quite simple and obvious. I'm finding it much more difficult to
apply
here.
I'm sure I'm going about this the wrong way (starting with the
language spec
and working backward instead of starting with the top rule and
decorating
the grammar one feature at a time), but I still figured it'd be worth
understanding how to deal with these situations when they inevitably
pop up.
I started with Crockford's fancy railroad diagrams, which have
implied
precedence. I feel like that's been lost in the translation to ANTLR
(or
maybe just lost sight of).
I'm pretty new to ANTLR, so take this with a lump of salt, but here is what
I think is going on.
Suppose, while looking for an assignmentExpression, your parser came across
the following input:
"name = name ( ... ".
It can't tell, given its LL(*) look-ahead capabilities, whether the second
name is the beginning of a leftHandSideExpression or of an expression
(because an arbitrary amount of stuff might precede an eventual "=').
I think something like the following might work:
assignmentExpression
: leftHandSideExpression ( '+=' | '-=' | '=' ) lhsOrExp
;
lhsOrExp
: (leftHandSideExpression '=')? leftHandSideExpression '='
lhsOrExp
| Expression
;
There is probably a "best practice" for dealing with this kind of problem,
but I don't know what it is.
Good luck!
Bowen
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.antlr.org/pipermail/antlr-interest/attachments/20090409/b714a57b/attachment.html
More information about the antlr-interest
mailing list