[antlr-interest] Problem with semantic predicates

Mon Jul 28 02:18:28 PDT 2008

Hi,

I have to write a grammar for an existing language that allows keywords 
to be used as variables. I've been able to write most of the grammar 
using semantic predicates, but I have now run into a problem and can't 
tell whether I'm using semantic predicates the wrong way or whether it 
is a bug in ANTLR. I have reduced the problem to the following grammar:

grammar foo;

options
{
    ASTLabelType = CommonTree;
    output = AST;
    language = Java;
}

master   
    :    foo
    |    bar
    |    blup
    ;

foo     :    {input.LT(1).getText().equals("FOO")}? IDENTIFIER LBRACE IDENTIFIER RBRACE
    ;

bar     :    {input.LT(1).getText().equals("BAR")}? IDENTIFIER (LBRACE IDENTIFIER RBRACE)?
    ;

blup     :    {input.LT(1).getText().equals("BLUP")}? IDENTIFIER
    ;

LBRACE     :    '(';
RBRACE    :    ')';

IDENTIFIER
    :    ('A'..'Z')+;

The rules 'foo' and 'bar' are almost identical. The only differences are 
the semantic predicates and the fact that in the rule 'bar' the braced 
identifier is optional.
Parsing the string "BAR(HI)" with this grammar, starting at the rule 
'master', fails because the rule 'foo' is chosen instead of the rule 
'bar'. The generated Java code shows the following logic:

    // $ANTLR start master
    // C:\\Temp\\foo.g:10:1: master : ( foo | bar | blup );
    public final master_return master() throws RecognitionException {
        master_return retval = new master_return();
        retval.start = input.LT(1);

        CommonTree root_0 = null;

        foo_return foo1 = null;

        bar_return bar2 = null;

        blup_return blup3 = null;

        try {
            // C:\\Temp\\foo.g:11:5: ( foo | bar | blup )
            int alt1=3;
            int LA1_0 = input.LA(1);

            if ( (LA1_0==IDENTIFIER) ) {
                int LA1_1 = input.LA(2);

                if ( (LA1_1==LBRACE) ) {
                    alt1=1;
                }
                else if ( (input.LT(1).getText().equals("BAR")) ) {
                    alt1=2;
                }
                else if ( (input.LT(1).getText().equals("BLUP")) ) {
                    alt1=3;
                }
                else {
                    NoViableAltException nvae =
                        new NoViableAltException("10:1: master : ( foo | bar | blup );", 1, 1, input);

                    throw nvae;
                }
            }
           ......

The first "if" checks whether the current token is an identifier, which 
is fine. But the next "if" only checks whether the following token is a 
left brace. If it is, the parser decides for the sub rule "foo" without 
evaluating any of the semantic predicates. Is this a bug in ANTLR or am 
I using semantic predicates the wrong way?
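
To make the difference concrete, here is a minimal standalone sketch in 
plain Java. It is not ANTLR output: the class and method names are made 
up for illustration, and it assumes the first token is already known to 
be an IDENTIFIER. It compares the decision as generated with the 
decision I expected, where each predicate is checked before an 
alternative is chosen:

```java
// Simplified model of the prediction for rule 'master'.
// Hypothetical names; not generated by ANTLR.
final class PredictionSketch {

    // Mirrors the generated logic: LA(2) is tested first, so an
    // IDENTIFIER followed by LBRACE always selects alternative 1 (foo),
    // whatever the text of the identifier is.
    // Returns -1 where the generated code throws NoViableAltException.
    static int predictAsGenerated(String firstTokenText, boolean nextIsLBrace) {
        if (nextIsLBrace) {
            return 1;
        } else if (firstTokenText.equals("BAR")) {
            return 2;
        } else if (firstTokenText.equals("BLUP")) {
            return 3;
        }
        return -1;
    }

    // The behaviour I expected: the predicate of each alternative takes
    // part in the decision, so "BAR" can never be predicted as foo.
    static int predictExpected(String firstTokenText, boolean nextIsLBrace) {
        if (firstTokenText.equals("FOO") && nextIsLBrace) {
            return 1;
        } else if (firstTokenText.equals("BAR")) {
            return 2;
        } else if (firstTokenText.equals("BLUP")) {
            return 3;
        }
        return -1;
    }

    public static void main(String[] args) {
        // Input "BAR(HI)": first token text is "BAR", next token is LBRACE.
        System.out.println("as generated: alt " + predictAsGenerated("BAR", true));
        System.out.println("expected:     alt " + predictExpected("BAR", true));
    }
}
```

For "BAR(HI)" the first method picks alternative 1 (foo), whose 
predicate then fails inside the rule, while the second picks 
alternative 2 (bar).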

Any help appreciated.

Regards,
Thomas

-- 
Interactive Objects Software GmbH
Basler Strasse 61
79100 Freiburg, Germany

Phone:  +49 761 400 73 0
mailto:thomas.woelfle at interactive-objects.com
