[antlr-interest] robust error recovery
Greg Smolyn
greg at smolyn.org
Wed Aug 19 11:07:26 PDT 2009
Hi,
I've been using ANTLR quite successfully in a number of places in a
fairly large product-- mainly for parsing CSS and JavaScript.
However, I've recently tried to make my grammars more resilient to
errors so that they recover more gracefully, and I've had a lot of
difficulty finding information both in the ANTLR book (which pretty
much only describes how to _print_ errors) and in the mailing list
archives. Mainly, I'd like to be able to catch a parsing exception,
skip the offending input, but _continue_ to parse the rest of the rule.
Unfortunately, the way the exceptions are structured, there is no way
to catch an exception in a rule and carry on-- you're already outside
the main loop of that rule.
Here's a scenario I've encountered-- I'll use CSS as an example.
Say you're given a CSS ruleset with a bad property:
a {
    margin: 100px;
    *margin: 200px;
    background: foo.gif;
}
What I'd like to do is have it throw the exception and skip the
property, but continue along with the next one. I am stymied by how
to do this, however.
All I've been able to manage is to skip the whole CSS rule, thus
losing even the properties that were already parsed properly. Here are
the relevant bits of grammar (before I tried to modify it to skip just
the single property):
<===== cut here ======>
ruleset
    : selectors '{' properties RCURLY -> ^( RULE selectors properties )
    ;
    catch[RecognitionException re]
    {
        ConsumeUntil(input, RCURLY);
        input.Consume();
    }

properties
    : declaration (SEMI declaration?)* -> ^(PROPERTIES declaration*)
    | -> PROPERTIES_EMPTY
    ;
    catch[RecognitionException re]
    {
        ConsumeUntil(input, RCURLY);
        if(input.LA(1) != RCURLY)
        {
            throw;
        }
        return properties();
    }

declaration
    : IDENT ':' args -> ^( PROPERTY IDENT args )
    ;
<===== cut here ======>
I attempted to add a catch clause on declaration, with an appropriate
ConsumeUntil(SEMI)-- however the issue is that it's in fact the
'properties' rule that throws the exception, as it cannot find the
right starting token for a 'declaration'.
So, of course, we can change the properties rule to instead
ConsumeUntil(SEMI), but then what? How do I convince it to continue
looping through the (SEMI declaration?)* clause? Even if I split it
out into more rules, I have the same problem-- there is a list of
declarations, but it's the rule containing the list that throws the
exception-- since it sees the '*' before margin, and can't descend
into declaration. But that exception occurs outside of the looping of
the rule, and thus I'm stuck.
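To make the behaviour I'm after concrete, here is a minimal hand-written
sketch in Java. It is not ANTLR-generated code and uses no ANTLR API;
the class DeclarationRecovery, its toy tokenizer and its method names
are all made up for illustration. The point is only that the
try/catch sits _inside_ the loop over declarations, so a bad
declaration is skipped up to the next ';' and the loop carries on.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, hand-rolled stand-in for the 'properties' rule.
class DeclarationRecovery {
    private final List<String> tokens;
    private int pos = 0;
    final List<String> errors = new ArrayList<>();

    DeclarationRecovery(String body) { tokens = tokenize(body); }

    // Toy lexer: ':', ';', '{', '}' and '*' are single tokens, the rest are words.
    static List<String> tokenize(String s) {
        List<String> out = new ArrayList<>();
        StringBuilder word = new StringBuilder();
        for (char c : s.toCharArray()) {
            boolean punct = ":;{}*".indexOf(c) >= 0;
            if (punct || Character.isWhitespace(c)) {
                if (word.length() > 0) { out.add(word.toString()); word.setLength(0); }
                if (punct) out.add(String.valueOf(c));
            } else {
                word.append(c);
            }
        }
        if (word.length() > 0) out.add(word.toString());
        return out;
    }

    private String la() { return pos < tokens.size() ? tokens.get(pos) : "<EOF>"; }

    private boolean atSync() {
        return la().equals(";") || la().equals("}") || la().equals("<EOF>");
    }

    // declaration : IDENT ':' args
    private String declaration() {
        String name = la();
        if (!Character.isLetter(name.charAt(0)))
            throw new IllegalStateException("expected IDENT but found '" + name + "'");
        pos++;
        if (!la().equals(":"))
            throw new IllegalStateException("expected ':' but found '" + la() + "'");
        pos++;
        StringBuilder args = new StringBuilder();
        while (!atSync()) {
            if (args.length() > 0) args.append(' ');
            args.append(la());
            pos++;
        }
        if (args.length() == 0)
            throw new IllegalStateException("missing value for '" + name + "'");
        return name + ": " + args;
    }

    // Loop over declarations; recovery happens per declaration, inside the loop.
    List<String> properties() {
        List<String> good = new ArrayList<>();
        while (!la().equals("}") && !la().equals("<EOF>")) {
            if (la().equals(";")) { pos++; continue; }
            try {
                good.add(declaration());
            } catch (IllegalStateException e) {
                errors.add(e.getMessage());
                while (!atSync()) pos++;   // skip to the next ';' or '}'
            }
        }
        return good;
    }

    public static void main(String[] args) {
        DeclarationRecovery p = new DeclarationRecovery(
            "margin: 100px; *margin: 200px; background: foo.gif; }");
        System.out.println(p.properties());
        System.out.println(p.errors);
    }
}
```

Run on the ruleset body above, this keeps the margin and background
declarations and records one error for the '*margin' line. What I
can't work out is how to get the same effect out of the generated
parser.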
Any help would be greatly appreciated-- I'd really like to
know a lot more about how to handle errors and how to insert or delete
bits of tree in the AST in these exception scenarios.
Thanks so much!
-greg