[antlr-interest] Working around the LALL(k) vs. LL(k) problem ?

Thu Mar 21 08:54:29 PST 2002

On Thursday, March 21, 2002, at 06:24  AM, mzukowski at bco.com wrote:

> I'm curious why it is important to you to have no syntactic ambiguity.  
> What
> kind of language are you designing?

I'll add some random thoughts on ambiguity and nondeterminism.  I need 
to think up a good description for the book anyway (not that this one 
here will be good) ;)

Technically, you want *any* computer language you design to be 
syntactically unambiguous.  Even if-then-else is unambiguous since the 
language manual says "ELSE goes with the most recent IF".  Remember that 
a language is just a valid set of sentences and however you say it is 
ok.  The trouble is when we try to formalize the language specification 
with a grammar.

If I remember properly, it's safe to say that if there is a language 
ambiguity, you will get an ambiguous grammar regardless of the parsing 
strategy (well, if it's an exact spec of the language).  For example, if 
a language says that "a()" is both a method call and a declaration then 
the language itself is ambiguous.  No grammar spec in the world will 
help your language definition.

Even if you have an unambiguous language, your grammar may not be able 
to express what you want--the grammar may be ambiguous.  Some languages 
are just too nasty to specify with simple BNF like meta-languages.  For 
example, the context sensitive languages are not compatible with our BNF 
meta-languages. [context-sensitive language say things like "only allow 
this construct if 30 lines from now you see a period character."]

As Randall points out, the parsing strategy affects the strength of the 
grammatical meta-language.  So, even if you have an unambiguous language 
with an unambiguous grammar, you might not have a (practical) parsing 
strategy that'll handle it.  The proper term to use is "deterministic" 
rather than "ambiguous" I feel when it comes to a specific flavor of 
grammar like LL(k) etc...  Languages and grammars can be ambiguous but 
parsers are nondeterminisitic, hence, you get nondeterminism warnings 
from ANTLR.  ANTLR doesn't know whether they are parsing strategy 
weaknesses, grammar ambiguities, or language ambiguities, but it does 
know that it doesn't know which path to choose at some point ;)  For 
example, IF-THEN-ELSE is nondeterministic for the parser but it's due to 
a grammar ambiguity not a strategy weakness.  Nondeterminism does not 
per force imply ambiguous grammar I don't think since one of the paths 
may lead to an invalid sentence; in that case, it's just a problem of 
the parsing strategy.

Also remember that there are an infinite number of grammars for any 
single language.  You could keep adding extra rules and such with no 
problem.  In fact, you often have to modify the grammar to suit the 
needs of your embedded actions.

Hmm...yep, that got the juices flowing.  Now lets see.  Where did I hide 
that M$ Word disk and my ANTLR book source...

Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/