[antlr-interest] Trouble using backtracking (I think)
Ron Hunter-Duvar
ron.hunter-duvar at oracle.com
Tue Apr 22 11:34:42 PDT 2008
Just a guess, but perhaps because the LL(*) look-ahead sees that it can
resolve this, it never even tries backtracking. It may only backtrack
when it encounters a situations where LL(*) fails. Hopefully someone
more knowledgeable can jump in and settle this, though. I'm curious.
Ron
John Woods wrote:
> Thanks Ron,
>
> I see your point about preferring to place longer alternatives first
> for performance reasons. And making the rearrangements you suggest
> does make my parser work.
>
> But I'm curious if anyone could explain why rearranging the order of
> alternatives would affect what input can be parsed when backtracking
> is enabled. I understand that the order of alternatives will affect
> which of several possible alternatives will be matched first. But in
> the case where the first alternative doesn't match, and the second
> does, I'm wondering why backtracking doesn't come back and try the
> second alternative. Perhaps, as you suggest, there's a limit to how
> much backtracking will back track.
>
>
> -----Original Message-----
> From: Ron Hunter-Duvar
> Sent: 04/22/2008 09:20 AM
>> I haven't worked with backtracking yet, but in general, your longer
>> alternatives should come first. I think if you swap the order of the
>> alternatives in the expression and lower_term rules it will work.
>>
>> I'm not sure how far back backtracking will go in order to try
>> matching the input (backtracking all the way to the beginning and
>> trying every possibility can lead to exponential parse times). In
>> this case, expression successfully matches the first alternative,
>> lower_term. Given this match, it's looking for the closing bracket
>> back up in the input rule. I don't know at this point if it will then
>> throw away the successful expression match and try the second
>> alternative.
>>
>> Is there a good reason why you want to try the shorter alternative
>> first, and backtrack if that ends up causing a downstream failure?
>>
>> Ron
>>
>>
>> John Woods wrote:
>>>
>>> I'm having trouble figuring out why my grammar isn't working as
>>> expected, and would greatly appreciate any pointers on what I'm
>>> misunderstanding.
>>>
>>> My generated parser behaves as if backtracking isn't working as I
>>> would expect. And further, when I modify the grammar in a way that I
>>> wouldn't expect to have an effect, backtracking appears to start
>>> working.
>>>
>>> I've simplified the grammar as much as I could and still reproduce
>>> the problem. And the simplified input I'm trying to parse is "(a =
>>> b)" (without the quotes).
>>>
>>> Here's the test grammar:
>>>
>>> grammar Test;
>>>
>>> options {
>>> language = Java;
>>> output = AST;
>>> backtrack = true;
>>> }
>>>
>>> input
>>> : expression
>>> | '(' input ')'
>>> ;
>>>
>>> expression
>>> // Swapping the following two lines makes it
>>> // work for input "(a = b)". But why won't it
>>> // work with backtracking the way it is?
>>> : lower_term
>>> | term '=' term
>>> ;
>>>
>>> term
>>> : lower_term
>>> | UPPER
>>> ;
>>>
>>> lower_term
>>> : LOWER
>>> // Removing "term*" from the following line, or
>>> // removing the line altogether makes it work.
>>> // Why is that? Seems unrelated to given input.
>>> | LOWER '(' term* ')'
>>> ;
>>>
>>> UPPER: 'A'..'Z';
>>> LOWER: 'a'..'z';
>>>
>>> WHITESPACE: (' '|'\t'|'\n'|'\r'|'\f') { skip(); };
>>>
>>> When I try to parse the input "(a = b)", I get the following error:
>>>
>>> BR.recoverFromMismatchedToken
>>> line 1:3 mismatched input '=' expecting ')'
>>>
>>> It appears that the parser attempts to parse the "expression" rule
>>> using the "lower_term" alternative only, and when that fails, it
>>> seems like backtracking isn't coming back and attempting the other
>>> alternative of "term '=' term".
>>>
>>> I'm using ANTLR version 3.0.1 (August 13, 2007).
>>>
>>> Finally, here's the test driver:
>>>
>>> import java.io.*;
>>> import org.antlr.runtime.*;
>>> import org.antlr.runtime.tree.*;
>>>
>>> public class Test {
>>> public static void main(String[] args) throws Exception {
>>> String string = "(a = b)";
>>> ByteArrayInputStream byteArrayInputStream =
>>> new ByteArrayInputStream(string.getBytes());
>>> ANTLRInputStream antlrInputStream =
>>> new ANTLRInputStream(byteArrayInputStream);
>>> TestLexer testLexer =
>>> new TestLexer(antlrInputStream);
>>> CommonTokenStream commonTokenStream =
>>> new CommonTokenStream(testLexer);
>>> TestParser testParser =
>>> new TestParser(commonTokenStream);
>>> CommonTree commonTree =
>>> (CommonTree)testParser.input().getTree();
>>> System.out.println(commonTree.toStringTree());
>>> }
>>> }
>>>
>>
>
>
--
Ron Hunter-Duvar | Software Developer V | 403-272-6580
Oracle Service Engineering
Gulf Canada Square 401 - 9th Avenue S.W., Calgary, AB, Canada T2P 3C5
All opinions expressed here are mine, and do not necessarily represent
those of my employer.
More information about the antlr-interest
mailing list