[antlr-interest] Semantic predicate behaviour with k>1

Jim Idle jimi at temporal-wave.com
Fri Oct 15 16:42:59 PDT 2010


You can't do that, but you could use options {k=1;} on this rule.

But all your alts call identifier anyway, so why would you do that?
Predicates are not supposed to have side effects, though I sometimes break
that rule on keyword vs identifier problems.

But it seems you just need to left factor your parser rule:

identifier ( LBRACKET ... | etc)
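
As a rough, untested sketch of what that left factoring could look like for the rule quoted below: match identifier once, then branch on the token that follows it. The rule names identifier_suffix and bracket_contents are hypothetical (they are not in the original grammar), and bracket_contents is assumed to accept a superset of constant_range_expression and constant_expression. Which kind of identifier was seen is not decided here at all.

```antlr
rule
    : primary_literal
    | identifier identifier_suffix?
    ;

// Hypothetical helper rule: the union of the suffixes from the predicated
// alternatives. After the shared identifier, one token of lookahead picks
// the alternative, so no semantic predicate is needed.
identifier_suffix
    : LBRACKET bracket_contents RBRACKET PERIOD?   // specparam range or genblock index
    | APOSTROPHE                                   // type
    | PERIOD                                       // genblock without an index
    | COLONCOLON constant_primary_package_scope_suffix
    | LPARAN list_of_arguments RPARAN              // tf_call
    ;
```

The lone LBRACKET? and LPARAN? in the PARAM and LET alternatives are left out of this sketch, on the assumption that the enclosing rule consumes what follows them.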

It looks to me like you are trying to type in a grammar from the normative
spec of something like Verilog, and do everything in one pass. You need to
parse the common syntax into a tree, then walk the tree and verify it
(throw out ranges that are not constant when they must be, etc.). Basically,
don't try to reject semantic errors in the parser.
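
A minimal sketch of such a verification pass, in plain C rather than against the ANTLR runtime tree API. The Node struct, the IdentKind values, and both function names are made up for illustration; the sketch assumes an earlier pass has already looked up each identifier in the symbol table and stored its kind on the node.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical identifier kinds, loosely mirroring the *_IDENT names
   used in the quoted grammar. */
typedef enum { KIND_UNKNOWN, KIND_PARAM, KIND_SPECPARAM, KIND_GENVAR, KIND_NET } IdentKind;

typedef enum { NODE_LITERAL, NODE_IDENT, NODE_BINOP, NODE_RANGE } NodeType;

/* Hypothetical tree node; a real ANTLR tree would be walked through
   its own child accessors instead. */
typedef struct Node {
    NodeType type;
    IdentKind kind;              /* only meaningful for NODE_IDENT */
    const struct Node *left;
    const struct Node *right;
} Node;

/* True if the subtree is built only from literals and identifiers
   whose kind is assumed to be constant. */
bool is_constant(const Node *n) {
    if (n == NULL) return false;
    switch (n->type) {
    case NODE_LITERAL:
        return true;
    case NODE_IDENT:
        return n->kind == KIND_PARAM || n->kind == KIND_SPECPARAM
            || n->kind == KIND_GENVAR;
    case NODE_BINOP:
    case NODE_RANGE:
        return is_constant(n->left) && is_constant(n->right);
    }
    return false;
}

/* Walks the whole tree and counts ranges with a non-constant bound.
   This is the semantic check that was kept out of the parser. */
int count_bad_ranges(const Node *n) {
    if (n == NULL) return 0;
    int bad = (n->type == NODE_RANGE && !is_constant(n)) ? 1 : 0;
    return bad + count_bad_ranges(n->left) + count_bad_ranges(n->right);
}
```

The parser then only has to build a NODE_RANGE for anything of the form identifier [ ... ], and the walk reports the ranges that are not constant where they must be.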

Jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of A Z
> Sent: Friday, October 15, 2010 3:40 PM
> To: antlr-interest at antlr.org
> Subject: [antlr-interest] Semantic predicate behaviour with k>1
> 
> Hello,
> 
>   I am seeing ANTLR generate unexpected code when using semantic
> predicates and am wondering if my grammar or understanding is
> incorrect. The EBNF has a rule similar to the following:
> 
> rule :
>     primary_literal
>   | {isIdent(LT(1)->getText(LT(1)),PARAM_IDENT)}?     identifier LBRACKET?
>   | {isIdent(LT(1)->getText(LT(1)),SPECPARAM_IDENT)}? identifier (LBRACKET constant_range_expression RBRACKET)?
>   | {isIdent(LT(1)->getText(LT(1)),TYPE_IDENT)}?      identifier APOSTROPHE
>   | {isIdent(LT(1)->getText(LT(1)),ENUM_IDENT)}?      identifier
>   | {isIdent(LT(1)->getText(LT(1)),GENVAR_IDENT)}?    identifier
>   | {isIdent(LT(1)->getText(LT(1)),LET_IDENT)}?       identifier LPARAN?
>   | {isIdent(LT(1)->getText(LT(1)),GENBLOCK_IDENT)}?  identifier (LBRACKET constant_expression RBRACKET)? PERIOD
>   | {isIdent(LT(1)->getText(LT(1)),PACKAGE_IDENT)}?   identifier COLONCOLON constant_primary_package_scope_suffix
>   | identifier ((LPARAN list_of_arguments RPARAN)=> LPARAN list_of_arguments RPARAN)? // tf_call
> 
> The last identifier type can be forward declared so that type is
> assumed if the identifier at this point is undefined. I previously had
> tried doing this by factoring but it makes the grammar very difficult
> to follow and substantially increases the number of rules.  With this
> rule ANTLR generates the following:
> 
>                 else if ( (LA1039_0 == SIMPLE_IDENT) )
>                 {
> 
>                     {
>                         int LA1039_2 = LA(2);
>                         if ( (LA1039_2 == LBRACKET || LA1039_2 == PERIOD) )
>                         {
>                             alt1039=8;
>                         }
>                         else if ( (LA1039_2 == APOSTROPHE) )
>                         {
>                             alt1039=4;
>                         }
>                         else if ( (LA1039_2 == COLONCOLON) )
>                         {
>                             alt1039=9;
>                         }
>                         else if ( ((isIdent(LT(1)->getText(LT(1)),PARAM_IDENT))) )
>                         {
>                             alt1039=2;
>                         }
>                         else if ( ((isIdent(LT(1)->getText(LT(1)),SPECPARAM_IDENT))) )
>                         {
>                             alt1039=3;
>                         }
>                         else if ( ((isIdent(LT(1)->getText(LT(1)),ENUM_IDENT))) )
>                         {
>                             alt1039=5;
>                         }
>                         else if ( ((isIdent(LT(1)->getText(LT(1)),GENVAR_IDENT))) )
>                         {
>                             alt1039=6;
>                         }
>                         else if ( ((isIdent(LT(1)->getText(LT(1)),LET_IDENT))) )
> 
> The first 3 conditions look out of place. It appears even with
> predicates, ANTLR will increase k if it thinks it can help resolve
> ambiguities. Chapter 13 in the book doesn't appear to describe cases
> like this. The first case won't work as three different alternatives
> match this sequence. If I force k=1 for this rule, then the code is
> generated as expected. Strangely, removing the PERIOD from the GENBLOCK
> subrule also works, but breaks the grammar. Is this expected behaviour?