[antlr-interest] Why does ANTLR generate code that will never call an OR'd alternative?

Avid Trober avidtrober at gmail.com
Sat Aug 21 11:30:06 PDT 2010


> Why can't you do something like:
>
> identifier: i=IDENTIFIER
>	{ if (isToken($i))
>	    { // code here for the isToken case
>	    }
>	  else
>	    { // code here (maybe empty) for the other case
>	    }
>	}
>	;
I tried numerous things, but I'm not sure how the above would work.  Wouldn't
the true case and the false case both simply be IDENTIFIER?
Again, all I'm doing is catching predefined tokens and overriding their
precedence as reserved keywords, so they are treated as identifiers.


> in order for isToken to be called, the lookahead would have to *not* be an
> IDENTIFIER.
My rules were (in a few of my trial & error attempts):

identifier: isToken(...) | IDENTIFIER;

isToken:   {...}? IDENTIFIER;


The lookahead can see that if isToken is true, the token will be an
IDENTIFIER.  But how does it know what might be in isToken?  It's an @members
function, written in the target language, and could be just about anything.
Therefore, I would have expected at least some form of the 'identifier' rule
to call isToken without first requiring the lookahead to be IDENTIFIER.
Every form I tried autogen'd code that required IDENTIFIER to match BEFORE it
would call isToken.  That doesn't make sense to me.
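
A rough sketch of the behaviour described above, assuming ANTLR v3 predicts
an alternative from the lookahead token *type* first and consults an ordinary
{...}? predicate only to choose among alternatives that the same token type
could start.  This is hand-written Java for illustration, not generated
parser output; the token-type constants, the body of isToken, and the
predict method are all hypothetical.

```java
final class PredictionSketch {
    // Hypothetical token types; a real parser takes these from the generated lexer.
    static final int IDENTIFIER = 1;
    static final int KW_SELECT = 2;

    // Hypothetical stand-in for the isToken function from @members.
    static boolean isToken(String text) {
        return text.equalsIgnoreCase("select");
    }

    /**
     * Models the decision for:
     *     identifier : isToken | IDENTIFIER ;
     *     isToken    : {...}? IDENTIFIER ;
     * Returns the chosen alternative (1 or 2), or 0 for "no viable alternative".
     */
    static int predict(int la1Type, String la1Text) {
        // Both alternatives begin with IDENTIFIER, so any other token type
        // is rejected before the predicate is ever evaluated.
        if (la1Type != IDENTIFIER) {
            return 0;
        }
        // Only now does the predicate pick between the two alternatives.
        return isToken(la1Text) ? 1 : 2;
    }

    public static void main(String[] args) {
        System.out.println(predict(KW_SELECT, "select"));   // keyword token type
        System.out.println(predict(IDENTIFIER, "select"));  // predicate true
        System.out.println(predict(IDENTIFIER, "total"));   // predicate false
    }
}
```

Under that assumption, a token the lexer has already typed as a keyword
never reaches isToken, whatever the function contains, because the
predicate is attached to an alternative that can only start with IDENTIFIER.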

Only when I explicitly listed all the token specifiers in the identifier
rule did I get autogen'd code that would call isToken (after that
questionable "if" statement, per below).   What?  Then why not have a
*function* test for those one-and-the-same values, so the grammar file is
cleaner and doesn't have to list all the tokens *twice*?  I'm sure there's a
good answer, literal tests vs. a function call.  But, again, ANTLR has no
idea what code is in that function, so how could it always avoid gen'ing a
call to it w/o first requiring IDENTIFIER to match?
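
One way to picture the difference between literal tests and a function call,
again as a hand-written Java sketch under the assumption that the parser
generator can reason only about token types it sees named in the grammar.
The keyword types KW_SELECT and KW_FROM, the LPAREN type, and both methods
are hypothetical names chosen for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class IdentifierRuleSketch {
    // Hypothetical token types; a real parser takes these from the generated lexer.
    static final int IDENTIFIER = 1;
    static final int KW_SELECT = 2;
    static final int KW_FROM = 3;
    static final int LPAREN = 4;

    // Lookahead set for:  identifier : IDENTIFIER | KW_SELECT | KW_FROM ;
    // Each token type named in the rule becomes a legal first token.
    static final Set<Integer> IDENTIFIER_START =
            new HashSet<>(Arrays.asList(IDENTIFIER, KW_SELECT, KW_FROM));

    // The "literal test": a pure token-type check, no user code involved.
    static boolean canStartIdentifier(int la1Type) {
        return IDENTIFIER_START.contains(la1Type);
    }

    // Accepts the token text as an identifier, or fails like a parser would.
    static String matchIdentifier(int type, String text) {
        if (!canStartIdentifier(type)) {
            throw new IllegalStateException("no viable alternative at '" + text + "'");
        }
        return text;
    }

    public static void main(String[] args) {
        System.out.println(matchIdentifier(KW_SELECT, "select"));
        System.out.println(canStartIdentifier(LPAREN));
    }
}
```

In this picture the keyword types are admitted because they are part of the
rule's own lookahead set; an opaque function cannot add to that set, which
would explain why the tokens have to be spelled out in the rule.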


> In cases like this, I have done:
Thank you...very much.  I will try that.

> This question comes up rather often on this list.
It's easy to find explicit discussion on token specifiers having precedence,
and how to override them for v2.  But I didn't find anything for v3, other
than hints that the resolution would be predicate/action-related.

One of my challenges in finding online help was locating discussion that
explicitly addresses token specifier precedence.  And I wasn't sure what to
search on (e.g. your solution does not explicitly address token specifiers).
Therefore, for an ANTLR shade-tree mechanic like me, I was left with trial &
error: debugging the autogen'd code and synthesizing predicate/action/other
stuff into a solution, vs. what I think should be a quick HOW TO for v3.
Frustrating, because I knew it was 10 lbs. of effort for a 2 oz. solution. :-)

Your reply is VERY much appreciated.

Regards,
Trober




