[antlr-interest] Struggling with recursion error
John B. Brodie
jbb at acm.org
Sat Dec 12 10:15:43 PST 2009
Greetings!
On Sat, 2009-12-12 at 17:22 +0100, André Næss wrote:
> I've been trying for quite some time now to put together a minimal toy
> parser, but I'm completely stuck on this recursion error that I don't
> understand.
>
> The parser I'm trying to write is a simple parser for a language that
> has only let-bindings, if-then-else, integers, variables, and function
> application. The problem is the latter. A function application
> typically looks like this:
>
> foo a b 10
>
> And I need this to parse as (foo a b 10), not (foo (a (b 10))) (I want
> to be able to pass functions as values).
>
> After some back and forth I figured that by making the topmost expr
> match either a function application or any other expression, where
> function applications all begin with ID, I should get the correct
> parse. And indeed, I do, with backtrack on. If I turn off backtrack,
> the whole thing fails with
>
> error(211): Exprs.g:28:10: [fatal] rule expr has non-LL(*) decision
> due to recursive rule invocations reachable from alts 1,2. Resolve by
> left-factoring or using syntactic predicates or using backtrack=true
> option.
It is a strange error message.
But your grammar *IS* ambiguous. See below....
> I've got the ANTLR book, and the only reference that I could find to
> this error was a trivial example involving labels that doesn't seem at
> all similar to my problem. My biggest problem is that I don't
> understand the error message at all. And the AntlrWorks debug output
> doesn't tell me much either, there's just one alternative to choose,
> and it paints one short line red. My initial intuition was that this
> was caused by the ID in atomExpr, so I tried commenting it out, but
> this has no effect. However, in the code below I've commented out the
> ID in atomExpr, focusing only on functions involving integers, to make
> it all a little simpler.
>
> What I don't get is where the recursion comes from. If an expression
> starts with ID, then this grammar says that it's a function
> application, and allows for a sequence of arbitrary expressions if,
> let or atom expressions to follow. If the expression doesn't start
> with ID, it has to be one of the others. So why on earth does the
> recursion come from? The allButFuncAppl* following ID in expr must
> either start with 'if', 'let', '(' or INT, how can this lead to
> recursion?
>
> I also wasn't able to solve this with a syntactic predicate. The only
> thing that helps is global backtracking, so while I can solve it, I'd
> really like to understand what's going on here.
>
> program
> : expr (';' expr)*
> ;
>
> expr: ID allButFuncAppl* -> ^(FUNC_APPL ID allButFuncAppl*)
> | allButFuncAppl
> ;
>
> allButFuncAppl:
> ifElseExpr
> | letExpr
> | atomExpr
> ;
>
> atomExpr
> : INT
> //| ID -> ^(VAR_REF ID)
> | '('! expr^ ')'!
> ;
>
> letExpr
> : 'let' letBinding+ (',' letBinding)* 'in' expr -> ^(LET letBinding+ expr)
> ;
(first, shouldn't that + operator be removed from the 1st occurrence of
letBinding in this letExpr rule? do you really want to permit :
let a=1 b=2 c=3 , d=5 in e
anyway....
>
> letBinding
> : (ID^ '='! expr)
> ;
>
> ifElseExpr
> : 'if' cond=expr 'then' thenPart=expr elsePart='else' expr ->
> ^(IFELSE $cond $thenPart $elsePart)
> ;
your ambiguity lies not with ID but with the allButFuncAppl (i believe).
consider the following sentence, which is legal in your grammar:
a let b=x in b 1
now to which function application does the '1' argument belong?
is it
(func_appl a (let ... (func_appl b)) 1)
or
(func_appl a (let ... (func_appl b 1)))
has to do with the arity of the a and b functions i suppose but the
parse doesn't know anything about that.
an obvious solution (but may not be acceptable to you) is to add a
required closing keyword to your letExpr and ifElseExpr rules
letExpr : 'let' ........ 'tel'
ifElseExpr : 'if' ........ 'fi'
Hope this helps...
-jbb
More information about the antlr-interest
mailing list