[antlr-interest] Non-determinism when k==2

Gokhan Caglar gcaglar at gmail.com
Wed Mar 15 10:08:16 PST 2006


Here are two grammars which I think are semantically equal, look at
the do loop in the generated code.  I know I can left factor DO's but
I am trying to demonstrate a behavior.  The first one reports a non
determinism:

For the grammar:

options {
    k = 2;
}

batch
    :
        (statement1)*
        (statement2)?
    ;

statement1
    :
        DO THIS
    | BREAK
    ;

statement2
    : DO THAT
    ;


This code gets generated:

            public final void batch() throws RecognitionException,
TokenStreamException {
                        try {      // for error handling
                                    {
                                    _loop6:
                                    do {
                                                if
((LA(1)==DO||LA(1)==BREAK) && (_tokenSet_1.member(LA(2)))) {
                                                            statement();
                                                }
                                                else {
                                                            break _loop6;
                                                }
                                    } while (true);
                                    }
                                    {
                                    switch ( LA(1)) {
                                    case DO:
                                    {
                                                statement2();
                                                break;
                                    }
                                    case EOF:
                                    case GO:
                                    {
                                                break;
                                    }
                                    default:
                                    {
                                                throw new
NoViableAltException(LT(1), getFilename());
                                    }
                                    }
                                    }
                        }
                        catch (RecognitionException ex) {
                                    reportError(ex);
                                    recover(ex,_tokenSet_2);
                        }
            }



For the grammar:

options {
    k = 2;
}

batch
    :
        (statement1 | statement3)*
        (statement2)?
    ;

statement1
    :
        DO THIS
    ;

statement3
    :
     BREAK
    ;

statement2
    : DO THAT
    ;


This code gets generated:

            public final void batch() throws RecognitionException,
TokenStreamException {
                        try {      // for error handling
                                    {
                                    _loop6:
                                    do {
                                                if ((LA(1)==DO) &&
(LA(2)==THIS)) {
                                                            statement1();
                                                }
                                                else if ((LA(1)==BREAK)) {
                                                            statement3();
                                                }
                                                else {
                                                            break _loop6;
                                                }
                                    } while (true);
                                    }
                                    {
                                    switch ( LA(1)) {
                                    case DO:
                                    {
                                                statement2();
                                                break;
                                    }
                                    case EOF:
                                    case GO:
                                    {
                                                break;
                                    }
                                    default:
                                    {
                                                throw new
NoViableAltException(LT(1), getFilename());
                                    }
                                    }
                                    }
                        }
                        catch (RecognitionException ex) {
                                    reportError(ex);
                                    recover(ex,_tokenSet_1);
                        }
            }

It goes back to the first generated code, and reports non-determinism
if we do this:
batch
    :
        ((statement1 | statement3))*  // Adding an extra paranthesis
        (statement2)?
    ;


Thanks,
Gokhan


More information about the antlr-interest mailing list