[antlr-interest] left recursion removal
John B. Brodie
jbb at acm.org
Wed Jul 6 15:14:16 PDT 2011
Greetings!
On Wed, 2011-07-06 at 21:19 +0200, Sébastien Kirche wrote:
> Hi,
>
> the language for which I am trying to build the grammar has 2 flavors
> of if-then-else constructs
> - a single line : if <condition> then <statement> [else <statement>]
> - a multi line : if <condition> then <statements> [else <statements>] end if
>
> I have defined the following (partial) :
>
> codeBlock
> : (compoundStatement)*
> ;
>
> compoundStatement
> : (
> ifStatement
> | singleStatement
> ) (';' | EOL)
> ;
>
> singleStatement
> : localVariableDeclaration
> | funCall
> | 'return' expression
> ;
>
> ifStatement
> : singleLineIf
> | multiLineIf
> ;
>
> singleLineIf
> : 'if' expression 'then' singleStatement EOL
> ;
>
> multiLineIf
> : 'if' expression 'then' codeBlock 'end if'
> ;
>
>
> I understand why the naive ifStatement fails with the following "
> [fatal] rule ifStatement has non-LL(*) decision due to recursive rule
> invocations reachable from alts 1,2. Resolve by left-factoring or
> using syntactic predicates or using backtrack=true option."
unable to reproduce.
given your admittedly partial grammar, i tried to construct a complete
example by adding the missing elements and creating an AST (so i could
know the resultant parse).
my test rig is attached.
it runs without error when Tool'd, compiled, and executed from the
command-line (FWIW i use Ubunto 11.04 Linux running Sun Java 6 and the
Antlr version from the antlr-3.4-complete.jar file).
Please try to post the *smallest* yet *complete* example of your
problem.
Hope this helps...
-jbb
>
> I would like to avoid general backtracking, so after searching for a
> while and reading the article
> http://www.antlr.org/wiki/display/ANTLR3/How+to+remove+global+backtracking+from+your+grammar
> I have tried first
>
> ifStatement
> : 'if' expression 'then' (singleStatement EOL)=> singleLineIf
> | multiLineIf
> ;
>
> or
>
> ifStatement
> : 'if' expression 'then' (singleStatement EOL | codeBlock 'end if')
> ;
>
> But they fail both with the same fatality.
> How this case should be processed ?
-------------- next part --------------
grammar Test;
options {
output = AST;
ASTLabelType = CommonTree;
}
tokens { SINGLE; MULTI; } // imaginary tokens go here
@members {
private static final String [] x = new String[] {
"local x\n",
"if x then return 1\n\n",
"if x then if y then return 1\n\nend if\n",
"local x;if x then return 1\n;",
};
public static void main(String [] args) {
for( int i = 0; i < x.length; ++i ) {
try {
System.out.println("about to parse:`"+x[i]+"`");
TestLexer lexer = new TestLexer(new ANTLRStringStream(x[i]));
CommonTokenStream tokens = new CommonTokenStream(lexer);
TestParser parser = new TestParser(tokens);
TestParser.test_return p_result = parser.test();
// System.out.format("the token stream:\%n");
// for( int j = 0; j < tokens.size(); ++j ) {
// Token token = tokens.get(j);
// System.out.format("\%d: type = \%s, text = `\%s`\%n",
// j,
// tokenNames[token.getType()],
// token.getText());
// }
CommonTree ast = p_result.tree;
if( ast == null ) {
System.out.println("resultant tree: is NULL");
} else {
System.out.println("resultant tree: " + ast.toStringTree());
}
System.out.println();
} catch(Exception e) {
e.printStackTrace();
}
}
}
}
test : codeBlock EOF! ;
codeBlock
: (compoundStatement)*
;
compoundStatement
: (
ifStatement
| singleStatement
) (';' | EOL)
;
singleStatement
: localVariableDeclaration
| funCall
| 'return'^ expression
;
ifStatement
: singleLineIf
| multiLineIf
;
singleLineIf
: 'if' expression 'then' singleStatement EOL
-> ^(SINGLE expression singleStatement)
;
multiLineIf
: 'if' expression 'then' codeBlock 'end if'
-> ^(MULTI expression codeBlock)
;
localVariableDeclaration : 'local'^ ID+ ;
funCall : ID '('^ args? ')'! ;
args : expression (','^ expression)* ;
expression : term (op^ term)* ;
term : ID | NUMBER ;
op : '+' | '-' | '*' | '/' ;
EOL : '\r'? '\n' ;
WS : (' ' | '\t')+ { skip(); };
ID : LETTER ( LETTER | DIGIT )* ;
NUMBER : DIGIT+ ;
fragment LETTER : ('a'..'z')|('A'..'Z') ;
fragment DIGIT : '0'..'9' ;
More information about the antlr-interest
mailing list