[antlr-interest] LL(k)

Wed Mar 19 07:13:31 PDT 2008

> On Wed, 19 Mar 2008 13:52:38 +0100
> "Patrick Hulsmeijer" <phulsmeijer at xebic.com> wrote:
>
> Thanks for your information!
> 
> I have trouble with the following (dis-ambiguating) predicates:
> 
> relationalExpression
> 	: shiftExpression ( ( LT^ | GT^ | LTE^ | GTE^ | INSTANCEOF^ |
> { !noIn }?=> IN^ ) shiftExpression )*
> 	;
> 
> The predicate in front of the IN alternative is there to be able to
> turn the IN alternative off because it can be ambiguous in the
> context of another rule. When it should be recognized and is not
> ambiguous the noIn is false, when it is ambiguous noIn is true. noIn
> is just a private field of my parser class. When I debug the code in
> my parser I see that the resulting DFA never predicts the IN
> alternative as viable when I use LL(*), even though noIn is false.

Hello Patrick,

I don't really know why, some ideas:

- I have never tried a semantic predicate in a sub-rule like
that, so I don't know if it works or not.  I guess if you were
curious you could try factoring it something like:

relationalExpression
 	: shiftExpression shiftExpressionRhs*
 	;

shiftExpressionRhs
        : ( LT^ | GT^ | LTE^ | GTE^ | INSTANCEOF^ | { !noIn }?=> IN^ ) shiftExpression
        ;

- maybe the DFA notices that the shiftExpression after the IN does not
match.

> statement
> 	: block
> 	| variableStatement
> 	| emptyStatement
> 	| expressionStatement
> 	| ifStatement
> 	| iterationStatement
> 	| continueStatement
> 	| breakStatement
> 	| returnStatement
> 	| withStatement
> 	| labelledStatement
> 	| switchStatement
> 	| throwStatement
> 	| tryStatement
> 	;
> 	
> block
> 	: LBRACE sourceElement* RBRACE
> 	;
> 
> expressionStatement
> 	: { !(input.LA(1) == LBRACE || input.LA(1) == FUNCTION) }?=>
> expression semic!
> 	;
> 
> The predicate in the expressionStatement is there because the
> expression and the statement rule both have alternatives that have a
> LBRACE and a FUNCTION on the left edge. But when I use LL(*) the DFA
> prediction in the statement rule never considers the block rule as an
> alternative anymore.

It seems confusing to write a semantic predicate which says it
can match an alternative if the input does not start with some tokens.

I think it would be better to instead have semantic predicates on
the other alternative which look for tokens that do match the
alternatives.  The dis-ambiguating semantic predicates can do things
like scan ahead over a function head looking for a token like
{ or ; that indicates whether it is a function declaration or maybe
a function call.

Regards, Mark

> Both situation work fine when I use LL(k) with a k of 2.
> 
> Regards, patrick.

--