[antlr-interest] Working around the LALL(k) vs. LL(k) problem ?

Wed Mar 20 17:17:38 PST 2002

On Thu, Mar 21, 2002 at 10:35:58AM +1100, Matthew Ford wrote:
> It depends what you mean by "ambiguous".
> I assume you mean for some given k lookahead the grammar is not "ambiguous".
> But you still have to decide what value of k you need and this may change as
> your grammar evolves.

Actually, I don't want to reference k at all for this purpose.
Syntactic ambiguity of a grammar is independent of method used to
parse the language.  (I'm specifically referring to context-free
grammars; e.g., EBNF.)  A grammar is unambiguous if each string has
only a single corresponding parse tree (aka AST).  (Note that a
grammar might require infinite look-ahead and still be unambiguous.)
If a given string in the language can be parsed into more than one
valid tree, then the grammar is ambiguous.  The infamous
"dangling-else" problem is an example of syntactic ambiguity.  I want
to make sure I avoid such problems in my grammar.  (Of course,
imposing rules that say that an "else" attaches to the nearest "if" is
used to eliminate the ambiguity in practice, but the grammar itself
does not encode these sorts of rules, and is therefore still
ambiguous.)

As another example, the grammars of natural human languages are
typically overflowing with ambiguity.  We determine what the intended
meaning was from context and "common sense".  There are constructed
human languages (e.g., Lojban, http://www.lojban.org) which have
unambiguous grammars.  (I believe that Lojban may require infinite
lookahead, but it is unambiguous, and I believe an EBNF grammar can be
downloaded from somewhere on the above website.  Tools for parsing it
have been written; I'm not sure whether the parsers are top-down or
bottom-up.  It might be interesting to write an ANTLR grammar for it.)

I suspect that ANTLR, for all its power in generating parsers, is
actually not quite the right tool for verifying that a given grammar
is unambiguous.  Because it uses linear approximation, it can see
ambiguity where there is none.  If I'm not mistaken, using predicates
to work around this problem might cause ANTLR to not complain about an
actual ambiguity.

Unfortunately, I don't know of any tools which will validate a
grammar's unambiguity without being tied to a particular parsing
method.  I'm not even sure if such a thing is possible, or
computationally practical.  I do know that grammars can be proven
unambiguous by hand, through formal mathematical proof, but that's
rather time-consuming except for simple grammars.  If anybody knows of
methods for automatic validation, I'd love to hear about them.

Let me reiterate, before I make myself unwelcome on this list, that
I'm not criticizing ANTLR in the slightest.  I think it's great at
what it does.  I'm just wondering if it's great at what I want it to
do -- the right tool for the job, and all that.

Randall Nortman

Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/