[antlr-interest] robust error recovery

Greg Smolyn greg at smolyn.org
Wed Aug 19 11:07:26 PDT 2009


Hi,

I've been using ANTLR quite successfully in a number of places in a
fairly large product, mainly for parsing CSS and JavaScript.

However, I've recently tried to make my grammars more resilient to
errors so that they recover more gracefully, and I've had a lot of
difficulty finding information, both in the ANTLR book (which pretty
much only describes how to _print_ errors) and in the mailing list
archives.  Mainly, I'd like to be able to catch a parsing exception,
skip the offending input, and _continue_ parsing the rest of the rule.
Unfortunately, the way the exceptions are structured, there is no way
to catch an exception in a rule and carry on: by the time the catch
clause runs, you're already outside the main loop of that rule.

Here's a scenario I've encountered-- I'll use CSS as an example.

Say you're given a CSS ruleset with a bad property:

a {
      margin: 100px;
     *margin: 200px;
     background: foo.gif;
}

What I'd like to do is have it throw the exception and skip the  
property, but continue along with the next one.  I am stymied by how  
to do this, however.

All I've been able to manage is to skip the ruleset altogether, thus
losing even the declarations that were already parsed correctly.
Here are the relevant bits of the grammar (before I tried to modify
it to skip just the single property):

<===== cut here ======>

ruleset
    : selectors '{' properties RCURLY -> ^( RULE selectors properties )
    ;
    catch[RecognitionException re]
    {
        // Skip ahead to the closing brace, then consume it.
        ConsumeUntil(input, RCURLY);
        input.Consume();
    }

properties
    : declaration (SEMI declaration?)* -> ^(PROPERTIES declaration*)
    | -> PROPERTIES_EMPTY
    ;
    catch[RecognitionException re]
    {
        // Skip ahead to the closing brace; rethrow if it was never found.
        ConsumeUntil(input, RCURLY);
        if (input.LA(1) != RCURLY)
        {
            throw;
        }
        return properties();
    }

declaration
    : IDENT ':' args -> ^( PROPERTY IDENT args )
    ;

<===== cut here ======>


I attempted to add a catch clause on 'declaration', with an
appropriate ConsumeUntil(input, SEMI).  The issue, however, is that
it's in fact the 'properties' rule that throws the exception, as it
cannot find the right starting token for a 'declaration'.

So, of course, we can change the 'properties' rule to call
ConsumeUntil(input, SEMI) instead, but then what?  How do I convince
it to continue looping through the (SEMI declaration?)* clause?  Even
if I split it out into more rules, I have the same problem: there is
a list of declarations, but it's the rule containing the list that
throws the exception, since it sees the '*' before margin and can't
descend into 'declaration'.  That exception occurs outside the loop
of the rule, and thus I'm stuck.
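
To make the behaviour I'm after concrete, here is a minimal
hand-written sketch in Python.  It is not ANTLR and not generated
code; the token kinds (IDENT, COLON, VALUE, SEMI, STAR, RCURLY) and
the function names are made up for illustration, and 'args' is
reduced to a single VALUE token.  The point is only that the
try/except sits _inside_ the loop over declarations, so one bad
declaration is skipped up to the next SEMI and the loop carries on:

```python
from typing import List, Tuple

# Hypothetical token representation: (kind, text).
Token = Tuple[str, str]


class ParseError(Exception):
    """Raised when a declaration does not match IDENT ':' VALUE."""


def parse_declaration(tokens: List[Token], pos: int):
    """declaration : IDENT ':' VALUE ; returns ((name, value), new_pos)."""
    for offset, kind in enumerate(("IDENT", "COLON", "VALUE")):
        i = pos + offset
        found = tokens[i][0] if i < len(tokens) else "EOF"
        if found != kind:
            raise ParseError(f"expected {kind} at token {i}, found {found}")
    return (tokens[pos][1], tokens[pos + 2][1]), pos + 3


def parse_properties(tokens: List[Token], pos: int = 0):
    """Parse declarations up to RCURLY, recovering per declaration."""
    good, errors = [], []
    while pos < len(tokens) and tokens[pos][0] != "RCURLY":
        if tokens[pos][0] == "SEMI":
            pos += 1  # separator between declarations
            continue
        try:
            decl, pos = parse_declaration(tokens, pos)
            good.append(decl)
        except ParseError as err:
            errors.append(str(err))
            # Resynchronise: drop tokens up to the next SEMI or RCURLY,
            # then let the enclosing loop continue with what follows.
            while pos < len(tokens) and tokens[pos][0] not in ("SEMI", "RCURLY"):
                pos += 1
    return good, errors, pos


if __name__ == "__main__":
    # Body of the ruleset from the CSS example, already tokenised.
    body = [
        ("IDENT", "margin"), ("COLON", ":"), ("VALUE", "100px"), ("SEMI", ";"),
        ("STAR", "*"), ("IDENT", "margin"), ("COLON", ":"), ("VALUE", "200px"),
        ("SEMI", ";"),
        ("IDENT", "background"), ("COLON", ":"), ("VALUE", "foo.gif"),
        ("SEMI", ";"), ("RCURLY", "}"),
    ]
    print(parse_properties(body))
```

With the CSS example above, this keeps the first margin and the
background declaration and records one error for the '*margin' line.
What I can't work out is how to get the same effect out of the
generated ANTLR parser.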

Any help would be greatly appreciated.  I'd really like to know a lot
more about how to handle errors and how to insert or delete bits of
tree in the AST in these exception scenarios.

Thanks so much!
-greg




