[antlr-interest] The Classic else (Non-)Ambiguity

Ron Burk ronburk at gmail.com
Mon May 3 15:37:00 PDT 2010


> You only have 100% confidence in it because that's how you define the
> semantics for the if/else ambiguity for your language.

No, that's what all those paragraphs in the middle of the post
were refuting. :-).

> The construct
> itself is ambiguous when if/else doesn't require some kind of block
> specification (like C/C++/Java).

No, it's really not. Do you understand why Sethi, Aho, and Ullman,
when writing about this precise grammar situation, say that
the alternative choice of the (merely technical) ambiguity is
"surely wrong"? They're not saying it because it's a bad idea
to handle if/else differently; they're saying it because the alternate
choice is the nonsensical removal of "else" from the language
completely. The result is (guess I better try caps) A PARSER
THAT NO LONGER ACCEPTS INPUTS THAT ARE LEGAL IN
THE SPECIFIED GRAMMAR. That's how you get 100% confidence
that the machine can resolve the "ambiguity" automatically.
THE OTHER CHOICE IS A BROKEN PARSER. Try the
exercise I suggested with yacc (bison will work fine) if
that statement's truth is not obvious.

> Some other language could define the semantics to intentionally match
> the else to the _outermost_ if.

Let's diverge to that non sequitur for a moment. Could you post
a grammar for such a language? I would like to see how that
can be done with a CFG. I guess the rule would be that else
is illegal for any if that is inside of another if statement. Seems
like you would have to define a new non-terminal (e.g., inner_if)
that does everything an if does, but can't contain an if (only
another inner_if) and can't be followed by an else. So, the
traditional discussion of if/else ambiguity is absurd on two
levels: a) nobody ever takes that extra step to think about
what it would actually mean to bind the else to the outermost
if and b) how the else binds has *nothing* to do with source
of the (trivial, technical, meaningless) grammar ambiguity.

But of course, the
real point is not what might be happening in the infinite
universe of possible grammars, but whether this precise
grammar's alleged ambiguous choice between the null
production and the non-null production could ever possibly
produce anything *anyone* would deem sensible if the
null production choice is selected. It can't. It can only
fail to produce a parser that meets the grammar spec.

> That said, we could define a special
> "if/else-uses-standard-resolution" option,

That's already the only option. The only issue is that parser
generators keep wasting users time by claiming that there
was an alternative choice that could possibly make sense.
There isn't. When you have two adjacent instances of the
same optional non-terminal, the only choice is the
non-null production, otherwise you render a terminal
illegal at a point where it should clearly be legal,
and your parser no longer accepts the language specified
by the grammar.

It's like having a warning message that says "I was going
to insert a bug on line 34, but I decided not to,
but you should spend hours studying this in case
you feel I should have went with the bug after all."

   r monty_python_parrot_sketch.txt
   s/parrot/parser/
   s/dead/broken/
   s/sleeping/ambiguous/
   w
   q

> Why introduce an explicit option for one construct that
> not all languages need?

No new options needed at all. Just stop complaining about
"ambiguities" for which the only alternative choice cannot
possibly be correct because it produces a parser that
no longer accepts the specified grammar. The ambiguity
is a "pseudo" ambiguity, because one of the two
choices produces a broken parser. This class of spurious
warnings has wasted enormous amounts of user time
over past decades. It seems likely to do so into the future.

Won't somebody please think of the children?  :-)


More information about the antlr-interest mailing list