[antlr-interest] Why does ANTLR generate code that will never call an OR'd alternative?

Avid Trober avidtrober at gmail.com
Sat Aug 21 00:27:45 PDT 2010


Gerald,

Thank you very much for your reply.

There's no alt skipped message in the error log.

The 'isToken' rule was simply my attempt to have the parser check whether the
token was in the tokens { ... } section.  At runtime, I found the token type
to always be the value from the tokens { ... } section, even when I tried to
change it:

	isToken	:	{isToken(input.LT(1))}? IDENTIFIER;

But, 'isToken' would never get called via the generated code, e.g. 

	identifier  :  isToken | IDENTIFIER;   // i.e. treat a token in the tokens section as an IDENTIFIER.

Therefore, I modified my 'identifier' rule to have each tokens { ... } value
in it, e.g.

	identifier:
		( 'TOKEN1' | 'TOKEN2' | ... | 'TOKEN_ELEVENTYTEEN_THOUSAND' )
			{ input.LT(-1).Type = IDENTIFIER; }
		| IDENTIFIER;

And that worked.  That is, wherever "identifier" appears in the grammar, it
now accepts an IDENTIFIER, as it always has, but also any 'TOKEN1',
'TOKEN2', etc. value found in tokens { ... }.
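
A minimal sketch of why the second form works where the first did not,
written as a plain Python simulation rather than real ANTLR output.  The
token type numbers and function names are hypothetical.  It assumes, as
observed at runtime, that each keyword in tokens { ... } arrives with its
own token type rather than IDENTIFIER, and that both alternatives of the
original rule begin with IDENTIFIER, so the predicate is only consulted to
choose between them:

```python
# Hypothetical token types standing in for the generated constants.
IDENTIFIER = 4
TOKEN1 = 5   # a keyword declared in tokens { ... }; it carries its own type
TOKEN2 = 6

KEYWORD_TYPES = {TOKEN1, TOKEN2}


def predict_original(la1, is_token_predicate):
    """Mimic the decision for: identifier : isToken | IDENTIFIER ;

    Both alternatives start with IDENTIFIER, so the predicate is evaluated
    only after LA(1) is already known to be IDENTIFIER.
    Returns the alternative number, or None for "no viable alternative".
    """
    if la1 == IDENTIFIER:
        return 1 if is_token_predicate else 2
    return None


def predict_workaround(la1):
    """Mimic the decision once the keyword token types are listed explicitly."""
    if la1 in KEYWORD_TYPES:
        return 1   # keyword alternative; the action then retypes it to IDENTIFIER
    if la1 == IDENTIFIER:
        return 2
    return None
```

With these definitions a keyword token never reaches the predicate in the
first form, whatever the predicate would have returned, while the second
form selects the keyword alternative directly.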

Personally, I hate this.  It means I need *two* places in my grammar to list
the keywords, the tokens { ... } section AND the identifier rule.  I'm sure
there's some way to do it via an action, predicate, whatever.  

I went down this path due to this recommendation: "The author's
recommendation is to use ordinary rules and the tokens command." at
http://www.antlr.org/wiki/display/ANTLR3/Quick+Starter+on+Parser+Grammars+-+No+Past+Experience+Required

It appears the tokens section is NOT the way to go; per-token rules (e.g.
keyToken1, keyToken2, etc.) may be the better approach.  But I can't rewrite
this grammar and risk breaking other things.  Perhaps I should in the future.
Preferably, I would simply like a way to scan through the tokens, note when
one is found, and then change its token type to IDENTIFIER, without listing
all the tokens twice in the grammar.

Any suggestions are very, very welcome.

Regards,
Trober




-----Original Message-----
From: Gerald Rosenberg [mailto:gerald at certiv.net] 
Sent: Saturday, August 21, 2010 1:35 AM
To: Avid Trober
Cc: antlr-interest at antlr.org
Subject: Re: [antlr-interest] Why does ANTLR generate code that will never
call an OR'd alternative?

  Most likely, the parser generation analysis determined that isToken 
can never be reached.  Check your error log for an alt skipped message.



------ Original Message (Saturday, August 21, 2010 1:01:20 AM) From: Avid Trober ------
Subject: [antlr-interest] Why does ANTLR generate code that will never call
an OR'd alternative?
> For this rule,
>
> identifier
>                  :       isToken | IDENTIFIER;
>
> ANTLR generates code that never calls the isToken rule
> (target=CSharp2):
>
>      public MYParser.identifier_return identifier()    // throws RecognitionException [1]
>      {
> .
>              // .  : ( isToken | IDENTIFIER )
>              int alt30 = 2;
>              int LA30_0 = input.LA(1);
>
>              if ( (LA30_0 == IDENTIFIER) )   //<== token must be IDENTIFIER to call isToken???
>              {
>                  int LA30_1 = input.LA(2);
>
>                  if ( ((isToken(input.LT(1)))) )  //<== why must LA30_0 == IDENTIFIER to call isToken?
>                  {
>                      alt30 = 1;
>                  }
>                  else if ( (true) )
>                  {
>                      alt30 = 2;
>                  }
> .
>              else                         //<== since not IDENTIFIER, why not call isToken here???
>              {
>                  NoViableAltException nvae_d30s0 =
>                      new NoViableAltException("", 30, 0, input);
>
>                  throw nvae_d30s0;
>              }
>
> I would think it's something to do with DFA optimization?   Perhaps that's
> why IDENTIFIER is checked first.
>
> But, if IDENTIFIER is false, why not call isToken???    After all, the rule
> is IDENTIFIER  ****OR***** isToken.
>
> Thanks,
>
> Trober
>


-- 

Gerald B. Rosenberg, Esq.