[antlr-interest] Why does ANTLR generate code that will never call an OR'd alternative?

Sat Aug 21 11:46:21 PDT 2010

On 08/21/2010 02:30 PM, Avid Trober wrote:
>> Why can't you do something like:
>>
>> identifier: i=IDENTIFIER
>> 	{ if (isToken($i))
>> 	    { // code here for the isToken case
>> 	    }
>> 	  else
>> 	    { // code here (maybe empty) for the other case
>> 	    }
>> 	}
>> 	;
> I tried numerous things, but I'm not sure how the above would work.  Wouldn't
> the true case simply be IDENTIFIER and the false case, too?

You need to understand that a semantic predicate is nothing more than an
IF statement.  Who cares whether you do the semantic test before choosing
a sub-rule or inside the single sub-rule, especially when both paths
match the *same* input?  In this case, that input is an IDENTIFIER.  But
that doesn't answer the question of how you include your literal tokens
in your identifiers.  I believe that is where your misunderstanding starts.
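
To make the "it's just an IF statement" point concrete, here is a rough hand-written Java sketch (not ANTLR-generated code). The class name, the token type value, the keyword list inside isToken, and the result strings are all assumptions made up for illustration. It shows the test done before choosing between two alternatives and the same test done inside a single alternative, with both accepting exactly the same input.

```java
// Hand-written illustration only; this is NOT code generated by ANTLR.
class PredicateSketch {
    // Hypothetical token type code; ANTLR generates its own constants.
    static final int IDENTIFIER = 4;

    // Hypothetical stand-in for an isToken helper from @members.
    static boolean isToken(String text) {
        return text.equals("begin") || text.equals("end");
    }

    // Style 1: the test chooses between two alternatives,
    // both of which match an IDENTIFIER.
    static String predicateFirst(int type, String text) {
        if (type == IDENTIFIER && isToken(text)) {
            return "keyword:" + text;
        } else if (type == IDENTIFIER) {
            return "plain:" + text;
        }
        throw new IllegalStateException("expected IDENTIFIER");
    }

    // Style 2: a single alternative matches IDENTIFIER,
    // then an action runs the same test.
    static String testInside(int type, String text) {
        if (type != IDENTIFIER) {
            throw new IllegalStateException("expected IDENTIFIER");
        }
        if (isToken(text)) {
            return "keyword:" + text;
        } else {
            return "plain:" + text;
        }
    }
}
```

Both methods reject anything that is not an IDENTIFIER and return the same result for every IDENTIFIER, so where the test sits does not change which input is matched.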

> Again, all I'm doing is catching predefined tokens and overriding their
> precedence as reserved keywords, to treat them as identifiers.

Then the method I proposed below is what you want.

>> in order for isToken to be called, the lookahead would have to *not* be an
>> IDENTIFIER.
> My rules were (in a few of my trial & error attempts):
> 
> identifier: isToken(...) | IDENTIFIER;
> 
> isToken:   {...}? IDENTIFIER;
> 
> 
> The lookahead can see if isToken is true, it'll be an IDENTIFIER.  But, how
> does it know what's possibly in isToken?  It's an @members function, written
> in the target language, and could be just about anything.  Therefore, I
> would have expected at least some form of the 'identifier' rule to call
> isToken without first forcing IDENTIFIER=true.  Every form I tried autogen'd
> code that required IDENTIFIER to be true - BEFORE - it would call isToken.
> That doesn't make sense to me.
> 
> Only when I explicitly listed all the token specifiers in the identifier
> rule did I get autogen'd code that would call isToken (after that
> questionable "if" statement, per below).   What?  Then why not have a
> *function* test for those one-and-the-same values so the grammar file is
> cleaner, not having to list all the tokens *twice* in the grammar file.  I'm
> sure there's a good answer, literal tests vs. a function call.  But,  again,
> ANTLR has no idea what code is in that function, so how could it have always
> avoided gen'ing a call to it w/o first requiring IDENTIFIER to be true?
> 
> 
>> In cases like this, I have done:
> Thank you...very much.  I will try that.
> 
>> This question comes up rather often on this list.
> It's easy to find explicit discussion on token specifiers having precedence,
> and how to override them for v2.  But, I didn't find anything for v3, other
> than one could see it's going to be predicate/action-related to resolve. 

Yes, you'll need to find the correct method to change the token's type
in your parser rule.
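
As a rough idea of what "changing the token's type" means, here is a small hand-written Java sketch. It does not use the ANTLR runtime: the Tok class, the type codes, and the keyword names KW_BEGIN and KW_END are hypothetical stand-ins, and this is not necessarily the exact method discussed elsewhere in this thread. In a real grammar the equivalent would be an action in the parser rule that calls the token's type setter (I believe the ANTLR 3 Java runtime's Token interface declares setType(int), but check the runtime documentation for your target).

```java
// Hand-written illustration only; it does not use the ANTLR runtime.
class RetypeSketch {
    // Hypothetical token type codes; ANTLR generates its own constants.
    static final int IDENTIFIER = 4;
    static final int KW_BEGIN = 5;
    static final int KW_END = 6;
    static final int NUMBER = 7;

    // Hypothetical minimal token: a mutable type plus the matched text.
    static class Tok {
        int type;
        final String text;

        Tok(int type, String text) {
            this.type = type;
            this.text = text;
        }
    }

    // Mirrors a parser rule of the form
    //   identifier : IDENTIFIER | KW_BEGIN | KW_END ;
    // with an action that retypes keyword tokens to IDENTIFIER.
    static Tok identifier(Tok t) {
        switch (t.type) {
            case KW_BEGIN:
            case KW_END:
                t.type = IDENTIFIER; // treat the reserved word as a plain identifier
                return t;
            case IDENTIFIER:
                return t;
            default:
                throw new IllegalStateException("expected identifier, got type " + t.type);
        }
    }
}
```

After the call, downstream code sees an IDENTIFIER whose text is still the original keyword spelling, so the lexer can keep its keyword tokens while the parser accepts them wherever an identifier is allowed.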

> One of my challenges was finding online discussion that explicitly
> addresses token specifier precedence.  And, I wasn't sure what to search
> on (e.g. your solution is not, explicitly, addressing token specifiers).
> Therefore, for an ANTLR shade-tree mechanic like me, I was left with trial &
> error debugging the autogen'd code and synthesizing predicate/action/other
> stuff into a solution vs. what, I think, should be a quick HOW TO for v3.
> Frustrating, because I knew it was 10lbs. of effort for a 2oz. solution. :-)
> 
> Your reply is VERY much appreciated.

Good luck, and let us know how it turns out for you.

> Regards,
> Trober

-- 
Kevin J. Cummings
kjchome at rcn.com
cummings at kjchome.homeip.net
cummings at kjc386.framingham.ma.us
Registered Linux User #1232 (http://counter.li.org)