[antlr-interest] Predicate hoisting pain

Mon Apr 13 09:28:35 PDT 2009

Jim Idle wrote:
> Sam Barnett-Cormack wrote:
>> Jim Idle wrote:
>>> However, as you can obviously distinguish the cases at some point 
>>> higher up the rule chain, then if you wish to pursue this, then all 
>>> you need do is create a scope with your flag in it at a high enough 
>>> level, init it to the default case, then set/unset it as the rules 
>>> descend, then use it as the gated predicate in your rule above:
>>>
>>> highuprule
>>>     scope
>>>      { boolean os; }
>>>     @init { $highuprule::os = false; }
>>> : rule rule rule ... ;
>>>
>>> ...
>>>
>>> // Z present means flip the flag
>>> ruleX :  X  Y (Z { $highuprule::os = true; }  objectSetSpec)?
>>> ;
>>>
>>> objectSetSpec
>>>    : {$highuprule::os}?=>additionalSetSpec
>>>    | something else
>>>    ;
>> It's more that it would have to be changed on the way down the parse 
>> tree, and changed back on the way back. So ruleX would be more like 
>> (if this would work):
>>
>> ruleX
>> @init {
>>   boolean os = $highuprule::os;
>> }
>>   : X Y ({$highuprule::os = true;} ruleZ {$highuprule::os = os;})?
>>   ;
>>
>> If that makes any sense at all.... 
> 
> Yes, but so long as your scope is high enough up then it is fine. Also, 
> don't forget that scopes stack so if you need to remember state, then 
> you could do it in ruleZ.

Good point. I forgot about shared scope stacks.
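
As a rough illustration of the stacking behaviour discussed above, here is a
plain-Java model, not ANTLR-generated code. The class OsScopeSketch and its
method names are made up for this sketch; it only assumes that each rule
declaring the scope pushes a frame on entry and pops it on exit, so the old
value of the flag comes back without the caller saving it by hand.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical model of a dynamically scoped "os" flag; names are illustrative only.
final class OsScopeSketch {
    // One frame per active rule invocation that declares the scope.
    private final Deque<Boolean> osStack = new ArrayDeque<>();
    // Records which alternative objectSetSpec would pick each time it is reached.
    final List<String> trace = new ArrayList<>();

    // Models highuprule: pushes the default value (false) for its duration.
    void highUpRule(Runnable body) { withOs(false, body); }

    // Models ruleZ switching the flag on entry and restoring it on exit.
    void ruleZ(Runnable body) { withOs(true, body); }

    // Models the gated predicate: reads the innermost frame only.
    void objectSetSpec() {
        trace.add(osStack.peek() ? "additionalSetSpec" : "somethingElse");
    }

    private void withOs(boolean value, Runnable body) {
        osStack.push(value);          // rule entry
        try {
            body.run();
        } finally {
            osStack.pop();            // rule exit: previous value is visible again
        }
    }

    // Walks: objectSetSpec, then ruleZ containing objectSetSpec, then objectSetSpec again.
    static List<String> demo() {
        OsScopeSketch p = new OsScopeSketch();
        p.highUpRule(() -> {
            p.objectSetSpec();
            p.ruleZ(p::objectSetSpec);
            p.objectSetSpec();
        });
        return p.trace;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this model the flag is true only while ruleZ is active, and anything nested
under ruleZ can push its own frame without disturbing the outer value.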

> Also, you should probably switch state in ruleZ and not in the caller, 
> using the @init and @after actions for ruleZ.
>> it needs to be able to change it for the duration of ruleZ (and stuff 
>> under ruleZ might change it as well for things under themselves) while 
>> changing it back to whatever it was on entering ruleX after it 
>> finishes with ruleZ. All assuming that X Y (ruleZ)? wouldn't be 
>> ambiguous, of course.
> This is why you should really merge it all into the one rule that knows 
> everything, then reject the invalid constructs later, so long as they 
> are syntactically sound. Your parser is there only to verify syntactic 
> structure, not to check that it is the right structure for the right place.
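
For illustration only, the "accept it syntactically, reject it later" approach
could look something like the plain-Java sketch below. Everything in it is an
assumption made for the example: the class ContextCheckSketch, the Node type,
and the validity rule that additionalSetSpec is legal only under Z while
anything else is legal only outside it. It does not use any ANTLR API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical post-parse check; all names and the validity rule are assumptions.
final class ContextCheckSketch {
    enum Kind { ADDITIONAL_SET_SPEC, SOMETHING_ELSE }

    // What a permissive, merged rule might hand to a later pass:
    // the construct it matched and whether it appeared under Z.
    static final class Node {
        final Kind kind;
        final boolean underZ;
        Node(Kind kind, boolean underZ) {
            this.kind = kind;
            this.underZ = underZ;
        }
    }

    // Returns one message per node whose kind does not fit its context.
    static List<String> check(List<Node> nodes) {
        List<String> errors = new ArrayList<>();
        for (int i = 0; i < nodes.size(); i++) {
            Node n = nodes.get(i);
            boolean isAdditional = n.kind == Kind.ADDITIONAL_SET_SPEC;
            // Assumed rule: additionalSetSpec if and only if under Z.
            if (isAdditional != n.underZ) {
                errors.add("node " + i + ": " + n.kind + " not allowed here");
            }
        }
        return errors;
    }
}
```

The parser then only has to agree that the input is well formed; whether a
construct sits in the right place is decided in this later pass.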

The thing is, in this case "in the right place" isn't just a case of 
being the right type for what it's being assigned to, for instance - 
it's a case of there being two very different constructs that occur in 
very different places (except once, and that's a separate problem), 
inside different syntactic constructs - there's precisely one place that 
both are allowed, and it's admittedly a pain in the butt, but there are a 
few solutions I'm considering. On the other hand, trying to combine the 
two unambiguously would lead to an AST barely more useful than a flat 
token stream, and the treewalker isn't going to be in that much of a 
better position to handle semantics. The handling of the token stream 
would be left to the post-parse semantic stage, which is much less 
convenient for handling it (the difference between hand-crafting a parser in 
Java and using ANTLR, essentially). The only compromise I can see would 
be keeping in the specific rules and calling them as start rules on a 
synthetic token stream, once everything else is on to true semantic stuff.

Sam