[antlr-interest] Difference in the following rules

Tue Jul 12 02:06:42 PDT 2005

On 7/12/05, Paul Johnson <gt54-antlr at cyconix.com> wrote:
> Ric Klaren wrote:
> > On 7/11/05, Tarun Khanna <tarunkhanna at gmail.com> wrote:
> >>The attached grammar produces a non-determinism warning in the following
> >>production
> >> factor :
> >>      ( ( LPAREN exp RPAREN ) | IDENT )  (DOT  IDENT)*  (DOT  TAB)?
> >>    |
> >>      TAB
> >>    ;
> >
> > Notice that (DOT IDENT)* may match the empty word, i.e. nothing,
> > and that (DOT TAB)? may also match nothing. On top of that, both
> > can start with a DOT. So ANTLR has a hard time choosing between
> > the closure (DOT IDENT)* and the optional part (DOT TAB)?.
> 
> But what if k is 2, as Tarun said in his original post?

I'm not sure; I didn't follow the first part of the thread and haven't
seen the complete grammar. The warning may come from the surrounding
rules, maybe in combination with the alternative that only matches a
TAB. You would have to check all the lookahead sets generated with
-diagnostic. When you have subrules or complete rules that can match
nothing, you get really unintuitive behaviour in the lookahead sets;
that's why we have ANTLR to tell us when we might have missed
something, although ANTLR is sometimes a bit overzealous. (Version 3
will make life quite a bit better...)

In any case, left factoring is usually a good way to get rid of those
warnings. In this case it also gives you better performance, since
ANTLR can make the decision by looking ahead one token instead of two
(or more).
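
As a rough sketch of what left factoring could look like for the
quoted factor rule: the shared DOT is pulled out in front, so the only
choice left after it is IDENT versus TAB, which a single token of
lookahead can settle. The helper rule name factorSuffix is made up for
this example, and since I haven't seen the rest of the grammar,
whether the warning really goes away still depends on what can follow
factor (a DOT in the follow set would still clash with the optional
part).

```antlr
// Sketch only: factorSuffix is a hypothetical helper rule.
// It accepts the same input as (DOT IDENT)* (DOT TAB)? in the original.
factor
    : ( LPAREN exp RPAREN | IDENT ) factorSuffix
    | TAB
    ;

// After the common DOT, one token decides:
//   IDENT -> another (DOT IDENT) step, then continue with the suffix
//   TAB   -> the final (DOT TAB), which ends the suffix
factorSuffix
    : ( DOT ( IDENT factorSuffix | TAB ) )?
    ;
```

The recursion keeps DOT TAB as the last element, just as in the
original rule; a plain ( DOT ( IDENT | TAB ) )* loop would be simpler
but would also accept input such as a.TAB.b.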

Cheers,

Ric