[antlr-interest] predicate question

Tue May 6 01:52:08 PDT 2008

At 13:20 6/05/2008, Terence Parr wrote:
 >parser grammar T;
 >
 >a : {p1}? {action} {p2}? A
 >   | A
 >   ;
 >
 >Someone pointed out that p2 should not be tested out of context
 >(i.e., before action executes).  ANTLR v2 (PCCTS) correctly
 >ignored preds after actions.  v3 definitely walks right over
 >actions looking for preds.   I propose to prevent p2 from being
 >hoisted into the decision for rule 'a'.  Right now it generates
 >enclosed DFA.  After fix, it would only see p1.  Not sure how to 

 >fix just yet, but i'll figure it out.

I think it ought to be changed.  The above sequence is logically:
   sempred action sempred token

By definition, any time a sempred appears at the left edge of an 
alt it is a disambiguating predicate, and whenever it appears 
anywhere else it is a validating predicate.  Validating predicates 
should never be hoisted; disambiguating predicates may be.

So the real question is whether the action should be considered 
sufficient to make this no longer the left edge of the alt, and so 
therefore whether the second sempred is a disambiguating or 
validating predicate.  I think the Principle of Least Surprise 
applies here -- an action is "doing something", and the sempred 
follows that, so it should logically only be evaluated after 
having done the preceding action.  This means it must be a 
validating predicate.

If it's really desired to have it be a disambiguating predicate, 
then as others have pointed out it should be possible for the 
grammar author to merge the action into p2, and even then possibly 
merge p2 with p1.  So making this change results in a gain of 
functionality without losing any -- another good reason to do it.

Having said all this, I'm not sure my opinion should be given much 
weight, since my general practice is to avoid sempreds wherever 
possible -- if given the option to specify arbitrary code, most 
people want to use parameters and prior-matched fragments to make 
decisions with, and when they get hoisted then all bets are off.

For example, what about this case:
   rule: A {bar();} b=B ({foo($b)}? C D)? E;

The sempred here is a disambiguating predicate, but it might need 
to be hoisted in certain contexts -- and if that happens, things 
will break, since it turns out that foo() depends on something 
done by bar().

Admittedly this is a bit of a contrived example, but I'm not sure 
there is a good solution for this last one (the only idea which 
comes to mind involves backtracking and "rollback" clauses, which 
seems bad for performance).  But it is something that people will 
beat their head against from time to time.