[antlr-interest] The Classic else (Non-)Ambiguity

Christopher L Conway cconway at cs.nyu.edu
Tue May 4 05:43:45 PDT 2010


Ron,

There really is an ambiguity. Given the rule
    Expr : if Expr then Expr (else Expr)? | ...
the input
    if A then if B then C else D
really can be parsed, in general, as either
    if A then (if B then C else D)
or
    if A then (if B then C) else D

What you are correct in pointing out is that only hard and fast rule
that makes sense (and hard and fast rules are generally helpful in
programming computers) is to parse the "else" eagerly. You could
imagine a parser that did not implement a hard and fast rule and
instead generated both parse trees and allowed the client to decide
which it liked best (and indeed there are parser generators that
behave that way, e.g., SGLR [1]).

It really is best to avoid these kinds of ambiguities and to
understand them if they exist in your grammar. Issuing a warning is
the right thing to do. The grammar author can take a look and say,
"Yes, this is an if-then-else-like construct and I'm OK with the eager
behavior," or, "No, this is a mistake. I'm going to rethink this
rule."

Regards,
Chris

  [1]: http://strategoxt.org/Sdf/SGLR


On Mon, May 3, 2010 at 6:37 PM, Ron Burk <ronburk at gmail.com> wrote:
>> You only have 100% confidence in it because that's how you define the
>> semantics for the if/else ambiguity for your language.
>
> No, that's what all those paragraphs in the middle of the post
> were refuting. :-).
>
>> The construct
>> itself is ambiguous when if/else doesn't require some kind of block
>> specification (like C/C++/Java).
>
> No, it's really not. Do you understand why Sethi, Aho, and Ullman,
> when writing about this precise grammar situation, say that
> the alternative choice of the (merely technical) ambiguity is
> "surely wrong"? They're not saying it because it's a bad idea
> to handle if/else differently; they're saying it because the alternate
> choice is the nonsensical removal of "else" from the language
> completely. The result is (guess I better try caps) A PARSER
> THAT NO LONGER ACCEPTS INPUTS THAT ARE LEGAL IN
> THE SPECIFIED GRAMMAR. That's how you get 100% confidence
> that the machine can resolve the "ambiguity" automatically.
> THE OTHER CHOICE IS A BROKEN PARSER. Try the
> exercise I suggested with yacc (bison will work fine) if
> that statement's truth is not obvious.
>
>> Some other language could define the semantics to intentionally match
>> the else to the _outermost_ if.
>
> Let's diverge to that non sequitur for a moment. Could you post
> a grammar for such a language? I would like to see how that
> can be done with a CFG. I guess the rule would be that else
> is illegal for any if that is inside of another if statement. Seems
> like you would have to define a new non-terminal (e.g., inner_if)
> that does everything an if does, but can't contain an if (only
> another inner_if) and can't be followed by an else. So, the
> traditional discussion of if/else ambiguity is absurd on two
> levels: a) nobody ever takes that extra step to think about
> what it would actually mean to bind the else to the outermost
> if and b) how the else binds has *nothing* to do with source
> of the (trivial, technical, meaningless) grammar ambiguity.
>
> But of course, the
> real point is not what might be happening in the infinite
> universe of possible grammars, but whether this precise
> grammar's alleged ambiguous choice between the null
> production and the non-null production could ever possibly
> produce anything *anyone* would deem sensible if the
> null production choice is selected. It can't. It can only
> fail to produce a parser that meets the grammar spec.
>
>> That said, we could define a special
>> "if/else-uses-standard-resolution" option,
>
> That's already the only option. The only issue is that parser
> generators keep wasting users time by claiming that there
> was an alternative choice that could possibly make sense.
> There isn't. When you have two adjacent instances of the
> same optional non-terminal, the only choice is the
> non-null production, otherwise you render a terminal
> illegal at a point where it should clearly be legal,
> and your parser no longer accepts the language specified
> by the grammar.
>
> It's like having a warning message that says "I was going
> to insert a bug on line 34, but I decided not to,
> but you should spend hours studying this in case
> you feel I should have went with the bug after all."
>
>   r monty_python_parrot_sketch.txt
>   s/parrot/parser/
>   s/dead/broken/
>   s/sleeping/ambiguous/
>   w
>   q
>
>> Why introduce an explicit option for one construct that
>> not all languages need?
>
> No new options needed at all. Just stop complaining about
> "ambiguities" for which the only alternative choice cannot
> possibly be correct because it produces a parser that
> no longer accepts the specified grammar. The ambiguity
> is a "pseudo" ambiguity, because one of the two
> choices produces a broken parser. This class of spurious
> warnings has wasted enormous amounts of user time
> over past decades. It seems likely to do so into the future.
>
> Won't somebody please think of the children?  :-)
>
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
>


More information about the antlr-interest mailing list