[antlr-interest] left recursion removal

John B. Brodie jbb at acm.org
Wed Jul 6 15:14:16 PDT 2011


Greetings!

On Wed, 2011-07-06 at 21:19 +0200, Sébastien Kirche wrote:
> Hi,
> 
> the language for which I am trying to build the grammar has 2 flavors
> of if-then-else constructs
> - a single line : if <condition> then <statement> [else <statement>]
> - a multi line : if <condition> then <statements> [else <statements>] end if
> 
> I have defined the following (partial) :
> 
> codeBlock
> 	:	(compoundStatement)*
> 	;
> 
> compoundStatement
> 	:	(
> 			ifStatement
> 		|	singleStatement
> 		) (';' | EOL)
> 	;
> 	
> singleStatement
> 	:	localVariableDeclaration
> 	|	funCall
> 	|	'return' expression
> 	;
> 
> ifStatement
> 	:	singleLineIf
> 	|	multiLineIf
> 	;
> 
> singleLineIf
> 	:	'if' expression 'then' singleStatement EOL
> 	;
> 
> multiLineIf
> 	:	'if' expression 'then' codeBlock 'end if'
> 	;
> 
> 
> I understand why the naive ifStatement fails with the following "
> [fatal] rule ifStatement has non-LL(*) decision due to recursive rule
> invocations reachable from alts 1,2.  Resolve by left-factoring or
> using syntactic predicates or using backtrack=true option."

unable to reproduce.

given your admittedly partial grammar, i tried to construct a complete
example by adding the missing elements and creating an AST (so i could
know the resultant parse).

my test rig is attached.

it runs without error when Tool'd, compiled, and executed from the
command-line (FWIW i use Ubunto 11.04 Linux running Sun Java 6 and the
Antlr version from the antlr-3.4-complete.jar file).

Please try to post the *smallest* yet *complete* example of your
problem.

Hope this helps...
   -jbb

> 
> I would like to avoid general backtracking, so after searching for a
> while and reading the article
> http://www.antlr.org/wiki/display/ANTLR3/How+to+remove+global+backtracking+from+your+grammar
> I have tried first
> 
> ifStatement
> 	:	'if' expression 'then' (singleStatement EOL)=> singleLineIf
> 	| 	multiLineIf
> 	;
> 
> or
> 
> ifStatement
> 	:	'if' expression 'then' (singleStatement EOL | codeBlock 'end if')
> 	;
> 
> But they fail both with the same fatality.
> How this case should be processed ?


-------------- next part --------------
grammar Test;

options {
   output = AST;
   ASTLabelType = CommonTree;
}

tokens { SINGLE; MULTI; } // imaginary tokens go here

@members {
   private static final String [] x = new String[] {
      "local x\n",
      "if x then return 1\n\n",
      "if x then if y then return 1\n\nend if\n",
      "local x;if x then return 1\n;",
  };

   public static void main(String [] args) {
      for( int i = 0; i < x.length; ++i ) {
         try {
            System.out.println("about to parse:`"+x[i]+"`");

            TestLexer lexer = new TestLexer(new ANTLRStringStream(x[i]));
            CommonTokenStream tokens = new CommonTokenStream(lexer);
            TestParser parser = new TestParser(tokens);
            TestParser.test_return p_result = parser.test();

            // System.out.format("the token stream:\%n");
            // for( int j = 0; j < tokens.size(); ++j ) {
            //    Token token = tokens.get(j);
            //    System.out.format("\%d: type = \%s, text = `\%s`\%n",
            //                      j,
            //                      tokenNames[token.getType()],
            //                      token.getText());
            // }

            CommonTree ast = p_result.tree;
            if( ast == null ) {
               System.out.println("resultant tree: is NULL");
            } else {
               System.out.println("resultant tree: " + ast.toStringTree());
            }
            System.out.println();
         } catch(Exception e) {
            e.printStackTrace();
         }
      }
   }
}

test : codeBlock EOF! ;

codeBlock
        :       (compoundStatement)*
        ;

compoundStatement
        :       (
                        ifStatement
                |       singleStatement
                ) (';' | EOL)
        ;
        
singleStatement
        :       localVariableDeclaration
        |       funCall
        |       'return'^ expression
        ;

ifStatement
        :       singleLineIf
        |       multiLineIf
        ;

singleLineIf
        :       'if' expression 'then' singleStatement EOL
      -> ^(SINGLE expression singleStatement)
        ;

multiLineIf
        :       'if' expression 'then' codeBlock 'end if'
      -> ^(MULTI expression codeBlock)
        ;

localVariableDeclaration : 'local'^ ID+ ;
funCall : ID '('^ args? ')'! ;
args : expression (','^ expression)* ;
expression : term (op^ term)* ;
term : ID | NUMBER ;
op : '+' | '-' | '*' | '/' ;

EOL : '\r'? '\n' ;

WS : (' ' | '\t')+ { skip(); };

ID : LETTER ( LETTER | DIGIT )* ;

NUMBER : DIGIT+ ;

fragment LETTER : ('a'..'z')|('A'..'Z') ;
fragment DIGIT : '0'..'9' ;


More information about the antlr-interest mailing list