[antlr-interest] predicate question

Tue May 6 05:46:15 PDT 2008

> At 13:20 6/05/2008, Terence Parr wrote:
>  >parser grammar T;
>  >
>  >a : {p1}? {action} {p2}? A
>  >   | A
>  >   ;
>  >
>  >Someone pointed out that p2 should not be tested out of context
>  >(i.e., before action executes).  ANTLR v2 (PCCTS) correctly
>  >ignored preds after actions.  v3 definitely walks right over
>  >actions looking for preds.   I propose to prevent p2 from being
>  >hoisted into the decision for rule 'a'.  Right now it generates
>  >enclosed DFA.  After fix, it would only see p1.  Not sure how to   
> 
>  >fix just yet, but i'll figure it out.  
> 
> On Tue, 06 May 2008 20:52:08 +1200
> Gavin Lambert <antlr at mirality.co.nz> wrote:
>
> I think it ought to be changed.  The above sequence is logically:
>    sempred action sempred token
> 
> By definition, any time a sempred appears at the left edge of an 
> alt it is a disambiguating predicate, and whenever it appears 
> anywhere else it is a validating predicate.  Validating predicates 
> should never be hoisted; disambiguating predicates may be.
> 
> So the real question is whether the action should be considered 
> sufficient to make this no longer the left edge of the alt, and so 
> therefore whether the second sempred is a disambiguating or 
> validating predicate.  I think the Principle of Least Surprise 
> applies here -- an action is "doing something", and the sempred 
> follows that, so it should logically only be evaluated after 
> having done the preceding action.  This means it must be a 
> validating predicate.

That seems like a neat idea.  It might be useful to be able
to have a validating predicate before consuming a token, as this might
be at the correct line number and column position to report a compiler
error.  I guess then if the action was empty, like:

a : {p1}? {} {p2}? A
  | A
  ;

Then {p2}? would still be a validating semantic predicate.

> If it's really desired to have it be a disambiguating predicate, 
> then as others have pointed out it should be possible for the 
> grammar author to merge the action into p2, and even then possibly 
> merge p2 with p1.  So making this change results in a gain of 
> functionality without losing any -- another good reason to do it.
> 
> 
> Having said all this, I'm not sure my opinion should be given much 
> weight, since my general practice is to avoid sempreds wherever 
> possible

Which is fine if you are lucky enough to be parsing a context free
language that does not need dis-ambiguating semantic predicates.

> -- if given the option to specify arbitrary code, most 
> people want to use parameters and prior-matched fragments to make 
> decisions with, and when they get hoisted then all bets are off.

Of course I do not wish to use parameters, as you are saying that
does not work as the dis-ambiguating semantic predicates may be hoisted.

> For example, what about this case:
>    rule: A {bar();} b=B ({foo($b)}? C D)? E;
> 
> The sempred here is a disambiguating predicate, but it might need 
> to be hoisted in certain contexts -- and if that happens, things 
> will break, since it turns out that foo() depends on something 
> done by bar().

The results of the something that was done by bar() could be either:

- stored in the symbol table.

- stored in the token A, and then foo() could scan backwards
looking for A, and retrieve the result.

Likewise for foo() needing to look at B:

- An action could be introduced after consuming token B to store
information about B in the symbol table, then foo() could look
up the information about B in the symbol table.

- foo() could scan backwards looking for B.

- An action could be introduced after consuming token B to store
information about B in the symbol table, and a pointer to the
symbol table information could be stored in token B.  Then foo()
foo() could scan backwards looking for B.

Thanks, Mark

> Admittedly this is a bit of a contrived example, but I'm not sure 
> there is a good solution for this last one (the only idea which 
> comes to mind involves backtracking and "rollback" clauses, which 
> seems bad for performance).  But it is something that people will 
> beat their head against from time to time.

--