[antlr-interest] need help with predicates

Thomas Brandon tbrandonau at gmail.com
Thu Aug 9 18:28:09 PDT 2007


On 8/10/07, Andy Tripp <antlr at jazillian.com> wrote:
>
>  I'm trying Tom's suggestion to use a "global scope".
>  I define my scope outside of all rules like this:
>
>  scope identifierBang {
>      boolean allow;
>  }
>
>  Inside my identifier rule, I make sure never to consume a '!' char unless
> this flag is true:
>  identifier
>  scope identifierBang;
>      : {! $identifierBang::allow}? => '['? ID_FRAGMENT ']'?
>      | '['? ID_FRAGMENT BANG? ']'?
>      ;
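I don't think you should declare the scope in this rule. I believe a
"scope identifierBang;" declaration makes ANTLR push a new copy of the
scope onto the stack each time the rule is entered, so identifier would
see its own fresh copy rather than the one dotOpExpression set. Where you
only want to read the flag, reference $identifierBang::allow without
declaring the scope.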
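
Something like the following is what I mean. It is an untested sketch
that reuses your rule and token names unchanged; the only difference is
that the scope declaration is gone, on the assumption that
$identifierBang::allow then resolves to the copy pushed by the nearest
enclosing rule that does declare the scope (dotOpExpression, or
compilationUnit by default):

    // No "scope identifierBang;" here: read the caller's copy
    // instead of pushing a new one.
    identifier
        : {! $identifierBang::allow}? => '['? ID_FRAGMENT ']'?
        | '['? ID_FRAGMENT BANG? ']'?
        ;

The rules that set the flag (dotOpExpression and compilationUnit) would
still need to declare the scope so there is something on the stack.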

Tom.
>
>  In my rule where I'm having the problem, I say "try it first without
> consuming the '!', and then
>  again normally" (global backtracking option is turned on):
>
>  dotOpExpression
>  scope identifierBang;
>  @after {
>      $identifierBang::allow = true;
>  }
>      : {$identifierBang::allow=false;} unaryOps dot2*
>      | {$identifierBang::allow=true;} unaryOps dot2*
>      ;
>
>  ...and that produces EmptyStackExceptions, so it appears that I have to put
> something on the stack initially:
>
>  compilationUnit
>  scope identifierBang;
>  @init {
>      $identifierBang::allow = true;
>  }
>      : ...and so on
>      ;
>
>  And what happens? As soon as it encounters "Dim A!", I get a "mismatched
> input" exception.
>
>  So, I'll take a fresh look at it tomorrow.
>  Seems like maybe just passing args down might be the only way to go.
>  The scopes idea seems good, but, as always, I'm left scratching my head
> when things don't work as expected.
>  And the code generated by ANTLR is looking more like YACC output as the
> years go by and I get older
>  and dumber :)
>
>  Thanks for the help,
>  Andy
>
>  p.s. you're right, dynamic scopes won't work, as there are ways to get to
> identifier other than
>  through this particular rule.
>
>
>  Thomas Brandon wrote:
>  On 8/10/07, Andy Tripp <antlr at jazillian.com> wrote:
>
>
>  The language I'm parsing, visual basic, lets an identifier have a '!'
> suffix:
>
> Identifier:
>  '['? LETTER (LETTER| DECIMAL_LITERAL)* ('%'|'#'|'$'|'&'|'!')? ']'?
>  ;
>
> But it also lets you use '!' as a "separator" the way C/C++/Java/etc.
> use '.'
> In the midst of a hierarchy of rules dealing with expressions, I have
> this rule:
>
> dotOpExpression:
>  unaryOps (
>  DOT^ dotOperand?
>  | BANG^ anyName?
>  )*
>  ;
>
> Here, the unaryOps, dotOperand, and anyName rules all eventually refer
> to Identifier.
> So the problem is that during the dotOpExpression processing, the
> unaryOps consumes
> the Identifier, including the '!'. So in trying to match "a!b", it
> fails, because it took "a!"
> as the Identifier and couldn't match the rest.
>
> So one solution is to take the '!' out of the Identifier rule, perhaps
> now calling it IdentifierNoBang,
> and then have alternative versions of other rules (unaryOpsNoBang,
> dotOperandNoBang, anyNameNoBang, etc).
> But that would be a huge mess.
>
> It seems like a syntactic predicate with "backtrack=true" should work
> here, but I can't quite see how.
> I want to say, in dotOpExpression, "try to match this pattern, but if
> that doesn't work, try again, but this
> time don't allow a '!' at the end of unaryOps". I can't see how to do
> that without all that rework to
> remove the '!' from Identifier.
>
>  Syntactic predicates only help ANTLR decide between alternatives, so
> you still need to be able to specify the alternatives as standard rules.
> So you need some way to specify an identifier with or without bang.
> Apart from duplicated rules, the option is a gated semantic predicate
> with either a field, a rule parameter, or a scope.
> I think with a field you might run into nesting issues, though I'm not
> sure there.
> With parameters:
> dotOpExpression
>  : (identifier[false] BANG identifier[true])=> identifier[false] BANG dotOpExpression
>  | unaryOps (DOT^ dotOperand?)
>  ;
>
> identifier[boolean allowBang]
>  : 'a'..'z'+
>  ( {allowBang}?=>BANG
>  | // Epsilon
>  )
>  ;
>
> Though then you have to always pass allowBang to your identifier rule,
> and will need to pass it down through various rules to get to
> identifier.
>
> You might be able to use scopes, but I think then you'd need to put
> them in a rule that all calls to identifier went through or else they
> wouldn't exist. So, I don't think dynamic scopes are suitable (I
> assume there is access to identifier not through dotOpExpression).
> Maybe you could add:
> scope identifierBang {
> boolean allow;
> }
> Then do:
> start
> scope identifierBang;
> @init {
>  $identifierBang::allow = true;
> }
>  : ...
>  ;
>
> dotOpExpression
> scope identifierBang;
>  : { $identifierBang::allow = false; }
>  (unaryOps BANG anyName)=>unaryOps BANG^ dotOpExpression
>  | { $identifierBang::allow = true; }
>  unaryOps (DOT^ dotOperand?)
>  ;
>
> identifier
>  : 'a'..'z'+
>  ( { $identifierBang::allow }?=>BANG
>  | // Epsilon
>  )
>  ;
> So if there's a call to dotOpExpression on the way to identifier it
> will get that copy of the scope, otherwise it will get the default
> copy of the scope from the start rule.
> Not especially clean, but it might work in lieu of a nicer solution.
>
> Tom.
>
>
>  Any ideas?
> Thanks,
> Andy
>

