[antlr-interest] Trouble using backtracking (I think)

Daniels, Troy (US SSA) troy.daniels at baesystems.com
Tue Apr 22 13:14:49 PDT 2008


I think the problem is that antlr doesn't do backtracking over the full
tree, but only within a rule.  Given your input of (a=b), we start with
these rules:

input("(a=b)")
  input("a=b)")
    expression("a=b)")

where the arguments are the unparsed tokens, written as a text string.
If lower_term failed to match against "a=b)", it'd try the second
alternative.  However, it does match and control returns to the input
rule with "=b)" as unparsed tokens.  This fails to match the expected
")".  The backtracking that you want would at this point retry
expression and force it to follow a different path than it did the last
time.  Antlr has a simpler backtracking, which just tries the next
alternative at the input rule.  It's already on the last alternative, so
it must fails.

The case where antlr backtracking works is where you have a rule that
wants to match a method call or a variable reference of arbitrary depth:
a.b.c.d.e() or a.b.c.d.e.  Something like

rule: methodCall | variableReference ;

methodCall: ID (DOT ID)* "()" ;

variableReference: ID (DOT ID)* ;

If your input is "a.b.c.d.e = 3", methodCall will match "a.b.c.d.e" but
then fail trying to match "()" against "=3", at which point rule will
try to match it against variableReference (which will succeed).  Note
that even in this case, order is important, since if you use
"variableReference | methodCall", variableReference will always match
(if either one can), methodCall will never be called and any call to
rule will fail (even with backtracking) if the input is a method call.

Troy

> 
> I'm having trouble figuring out why my grammar isn't working as
> expected, and would greatly appreciate any pointers on what I'm
> misunderstanding.
> 
> My generated parser behaves as if backtracking isn't working as I
would
> expect. And further, when I modify the grammar in a way that I
wouldn't
> expect to have an effect, backtracking appears to start working.
> 
> I've simplified the grammar as much as I could and still reproduce the
> problem. And the simplified input I'm trying to parse is "(a = b)"
> (without the quotes).
> 
> Here's the test grammar:
> 
>      grammar Test;
> 
>      options {
>          language = Java;
>          output = AST;
>          backtrack = true;
>      }
> 
>      input
>          : expression
>          | '(' input ')'
>          ;
> 
>      expression
>          // Swapping the following two lines makes it
>          // work for input "(a = b)". But why won't it
>          // work with backtracking the way it is?
>          : lower_term
>          | term '=' term
>          ;
> 
>      term
>          : lower_term
>          | UPPER
>          ;
> 
>      lower_term
>          : LOWER
>          // Removing "term*" from the following line, or
>          // removing the line altogether makes it work.
>          // Why is that? Seems unrelated to given input.
>          | LOWER '(' term* ')'
>          ;
> 
>      UPPER: 'A'..'Z';
>      LOWER: 'a'..'z';
> 
>      WHITESPACE: (' '|'\t'|'\n'|'\r'|'\f') { skip(); };
> 
> When I try to parse the input "(a = b)", I get the following error:
> 
>      BR.recoverFromMismatchedToken
>      line 1:3 mismatched input '=' expecting ')'
> 
> It appears that the parser attempts to parse the "expression" rule
using
> the "lower_term" alternative only, and when that fails, it seems like
> backtracking isn't coming back and attempting the other alternative of
> "term '=' term".
> 
> I'm using ANTLR version 3.0.1 (August 13, 2007).
> 
> Finally, here's the test driver:
> 
>      import java.io.*;
>      import org.antlr.runtime.*;
>      import org.antlr.runtime.tree.*;
> 
>      public class Test {
>          public static void main(String[] args) throws Exception {
>              String string = "(a = b)";
>              ByteArrayInputStream byteArrayInputStream =
>                  new ByteArrayInputStream(string.getBytes());
>              ANTLRInputStream antlrInputStream =
>                  new ANTLRInputStream(byteArrayInputStream);
>              TestLexer testLexer =
>                  new TestLexer(antlrInputStream);
>              CommonTokenStream commonTokenStream =
>                  new CommonTokenStream(testLexer);
>              TestParser testParser =
>                  new TestParser(commonTokenStream);
>              CommonTree commonTree =
>                  (CommonTree)testParser.input().getTree();
>              System.out.println(commonTree.toStringTree());
>          }
>      }



More information about the antlr-interest mailing list