[antlr-interest] Error recovery contortion

Paul J. Lucas pauljlucas at mac.com
Thu Dec 2 12:01:09 PST 2004


	So my problem is that I want to do error recovery.  (I've
	posted some things about this before.)

	Some background: the language I'm parsing is XQuery
	<http://www.w3.org/TR/xquery/> which, among other annoyances,
	is keyword-free (it has no reserved words), so the lexer has to
	be stateful, and that makes recovery much harder.

	As a first pass, I want to recover from syntax errors inside
	function declarations only.  I can't simply use ANTLR's default
	error-recovery mechanism because I have to sync to a known
	token and reset the lexer's state.  (ANTLR's default mechanism
	syncs to one of the tokens in the follow set.)  Function
	declarations in XQuery end with a ';' so, upon error, I throw
	away all tokens until I get to that.  (I will hopefully be able
	to improve this in the future, but for now, it's good
	enough.)
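
	A minimal sketch of the "throw away everything up to the ';'"
	idea, for illustration only.  It models tokens as plain strings
	in an Iterator; the class and method names are made up for this
	example, and a real ANTLR parser would use its own consume()
	and lookahead calls and would also have to reset the lexer's
	state afterwards, which is not modeled here.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Toy model of "discard everything up to and including the next ';'".
// Tokens are plain strings; nothing here is ANTLR API.
class SyncToSemicolon {

    // Discards tokens through the next ";" and returns how many were
    // discarded.  Stops early if the input runs out first.
    static int skipPastSemicolon( Iterator<String> tokens ) {
        int discarded = 0;
        while ( tokens.hasNext() ) {
            ++discarded;
            if ( tokens.next().equals( ";" ) )
                break;
        }
        return discarded;
    }

    public static void main( String[] args ) {
        // A broken function body followed by the next declaration.
        List<String> input =
            Arrays.asList( "1", "+", "}", ";", "declare", "function" );
        Iterator<String> it = input.iterator();
        int n = skipPastSemicolon( it );
        System.out.println( "discarded " + n + ", resuming at " + it.next() );
    }
}
```

	Syncing to a single, fixed token is crude, but it gives a known
	place at which the lexer's state can be reset.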

	Setting defaultErrorHandler=false makes this work fine for
	syntax errors inside function declarations.  I have something
	like this:

		functionDeclBody
		    : enclosedExpr
		    ;
		    exception
		    catch [ RecognitionException re ] {
		        ## = #([ERROR,"ERROR"]);
		        recover( re );
		    }

	where recover() is my own, working recovery function.  Hence,
	if an exception is thrown during enclosedExpr, it will be
	caught and recovered from and the generated AST is just fine.
	So far, so good.

	But, if there's a syntax error *outside* a function
	declaration, the generated AST is trashed.  Another requirement
	is that I keep the generated AST up to the point of the error
	outside a function declaration.  As I've mentioned previously,
	the AST gets trashed because, when an exception is thrown and
	there's no recovery in place, the AST is never stitched
	together: stitching is done only upon successful function
	*return*, and stack unwinding upon an exception bypasses
	normal function returns.
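
	To illustrate the stitch-on-return problem, here is a toy
	model.  It is not ANTLR's generated code; Node, SyntaxError,
	and the rule methods are invented for this sketch.  Each
	"rule" builds its subtree locally and hands it to its caller
	only by returning normally, so an uncaught exception leaves
	the caller with nothing, whereas a handler inside the rule
	lets the rule return and keeps the partial tree.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model: subtrees reach the caller only through a normal return.
class StitchOnReturn {

    static class Node {
        final String text;
        final List<Node> children = new ArrayList<>();
        Node( String text ) { this.text = text; }
    }

    static class SyntaxError extends Exception {
        SyntaxError( String msg ) { super( msg ); }
    }

    // Child rule: fails on the token "BAD".
    static Node item( String token ) throws SyntaxError {
        Node n = new Node( "item" );
        n.children.add( new Node( token ) );
        if ( token.equals( "BAD" ) )
            throw new SyntaxError( "unexpected " + token );  // n is lost
        return n;
    }

    // No handler: the exception skips the final return, so the caller
    // never sees the children collected so far.
    static Node listNoHandler( String... tokens ) throws SyntaxError {
        Node root = new Node( "list" );
        for ( String t : tokens )
            root.children.add( item( t ) );
        return root;
    }

    // Local handler: the error is caught inside the rule, the rule
    // still returns normally, and the partial tree survives.
    static Node listWithHandler( String... tokens ) {
        Node root = new Node( "list" );
        try {
            for ( String t : tokens )
                root.children.add( item( t ) );
        } catch ( SyntaxError e ) {
            root.children.add( new Node( "ERROR" ) );
        }
        return root;
    }

    public static void main( String[] args ) {
        Node partial = listWithHandler( "a", "b", "BAD", "c" );
        System.out.println( partial.children.size() );  // a, b, ERROR
        try {
            listNoHandler( "a", "b", "BAD", "c" );
        } catch ( SyntaxError e ) {
            System.out.println( "no tree at all: " + e.getMessage() );
        }
    }
}
```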

	OK, so I tried setting defaultErrorHandler=true.  This keeps
	the generated AST intact for errors outside of function
	declarations, but now the problem is that ANTLR recovers all by
	itself while doing enclosedExpr, so functionDeclBody above is
	never given the opportunity to catch the exception and do the
	correct recovery.  Hence, this breaks my recovery mechanism.

	Sigh...

	So I looked at the ANTLR-generated Java code: it calls
	reportError() during its own error recovery.  So what I need to
	do is continue to allow it to recover as normal (so my AST is
	preserved) *except* when the current call stack contains
	functionDeclBody, i.e., if reportError() is called "through"
	functionDeclBody, do my own recovery instead.  OK, so set a flag
	in my parser:

		functionDeclBody
		{
		    m_recoverable = true;
		}
		    : enclosedExpr
		        {
		            m_recoverable = false;
		        }
		    ;
		    exception
		    catch [ RecognitionException re ] {
		        ## = #([ERROR,"ERROR"]);
		        recover( re );
		    }

	and override reportError() like:

		public void reportError( RecognitionException re ) {
		    final boolean recoverable = m_recoverable;
		    m_recoverable = false;
		    if ( recoverable )
		        throw new ANTLR_WorkaroundException( re );

		    // ... other recovery not relevant to this post ...
		}

	i.e., if I'm doing my own recovery, I want any exception caught
	by ANTLR's recovery mechanism to be rethrown so the stack
	unwinds back up to functionDeclBody.  One slight problem:
	reportError() isn't declared to throw any exception.  Hence, I
	created the ANTLR_WorkaroundException class that extends
	RuntimeException to work around this annoyance.
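
	A stand-alone sketch of the flag-plus-unchecked-wrapper
	pattern, so the control flow can be tried without ANTLR.  The
	RecognitionException here is a stand-in class, not ANTLR's,
	and the methods only imitate the shape of generated rule
	methods.  The sketch also assumes a handler for the wrapper
	type in functionDeclBody, since in Java a catch of
	RecognitionException alone does not match a RuntimeException
	subclass; that handler and the recognitionException()
	accessor are assumptions of this example.

```java
// Stand-alone model of the workaround; compiles without the ANTLR jar.
class WorkaroundDemo {

    // Stand-in for ANTLR's checked RecognitionException.
    static class RecognitionException extends Exception {
        RecognitionException( String msg ) { super( msg ); }
    }

    // Unchecked wrapper, so it can escape reportError(), which declares
    // no checked exceptions.
    static class ANTLR_WorkaroundException extends RuntimeException {
        ANTLR_WorkaroundException( RecognitionException cause ) {
            super( cause );
        }
        RecognitionException recognitionException() {
            return (RecognitionException)getCause();
        }
    }

    boolean m_recoverable;
    final StringBuilder log = new StringBuilder();

    void reportError( RecognitionException re ) {
        final boolean recoverable = m_recoverable;
        m_recoverable = false;
        if ( recoverable )
            throw new ANTLR_WorkaroundException( re );
        log.append( "default:" ).append( re.getMessage() ).append( ';' );
    }

    // Imitates a nested rule that uses the default handler.
    void enclosedExpr( boolean fail ) {
        try {
            if ( fail )
                throw new RecognitionException( "bad expr" );
        } catch ( RecognitionException ex ) {
            reportError( ex );  // rethrows when m_recoverable is set
        }
    }

    // Imitates the rule that wants to do its own recovery.
    void functionDeclBody( boolean fail ) {
        m_recoverable = true;
        try {
            enclosedExpr( fail );
        } catch ( ANTLR_WorkaroundException e ) {
            // Custom recovery would go here.
            log.append( "custom:" )
               .append( e.recognitionException().getMessage() )
               .append( ';' );
        } finally {
            m_recoverable = false;
        }
    }

    public static void main( String[] args ) {
        WorkaroundDemo p = new WorkaroundDemo();
        p.functionDeclBody( true );  // error "through" functionDeclBody
        p.enclosedExpr( true );      // error elsewhere
        System.out.println( p.log ); // custom:bad expr;default:bad expr;
    }
}
```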

	OK, I'm pretty sure this all works, but it requires a lot of
	programming contortion, more than should be necessary.

	A suggestion is to change the default exception-handling code
	emitted to something like:

		catch ( RecognitionException ex ) {
		    reportError( ex );
		    recover( ex, _someTokenSet );
		}

	where recover() is a new method in Parser.java that, by
	default, is:

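		// Default recovery: skip the offending token, then
		// resync to the first token found in the given set.
		void recover( RecognitionException ex, BitSet set )
		    throws TokenStreamException
		{
		    consume();
		    consumeUntil( set );
		}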

	This will allow a user to override what recovery does without
	having to use the hack of stuffing such code into reportError()
	(where it doesn't conceptually belong).
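
	A rough sketch of how such a hook might be used, with a toy
	base class standing in for Parser.java.  BaseParser,
	XQueryLikeParser, LA1(), and the string-based token sets are
	all invented for this example; only the shape of recover()
	follows the suggestion above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class RecoverHookDemo {

    // Minimal stand-in for a parser base class; tokens are strings.
    static class BaseParser {
        final List<String> tokens;
        int pos = 0;

        BaseParser( List<String> tokens ) { this.tokens = tokens; }

        String LA1() {
            return pos < tokens.size() ? tokens.get( pos ) : "EOF";
        }

        void consume() {
            if ( pos < tokens.size() )
                ++pos;
        }

        void consumeUntil( Set<String> set ) {
            while ( !LA1().equals( "EOF" ) && !set.contains( LA1() ) )
                consume();
        }

        // The proposed hook: skip one token, then resync to the set.
        void recover( Exception ex, Set<String> set ) {
            consume();
            consumeUntil( set );
        }
    }

    // A parser that overrides the hook: ignore the set and discard
    // everything through the next ';'.
    static class XQueryLikeParser extends BaseParser {
        XQueryLikeParser( List<String> tokens ) { super( tokens ); }

        @Override
        void recover( Exception ex, Set<String> set ) {
            while ( !LA1().equals( "EOF" ) && !LA1().equals( ";" ) )
                consume();
            consume();  // discard the ';' itself
        }
    }

    public static void main( String[] args ) {
        List<String> input = Arrays.asList( "x", ")", "y", ";", "declare" );
        Set<String> set = new HashSet<>( Arrays.asList( ")" ) );

        BaseParser a = new BaseParser( input );
        a.recover( new Exception( "syntax error" ), set );
        System.out.println( a.LA1() );  // )

        BaseParser b = new XQueryLikeParser( input );
        b.recover( new Exception( "syntax error" ), set );
        System.out.println( b.LA1() );  // declare
    }
}
```

	With a hook like this, reportError() could go back to doing
	nothing but reporting.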

	- Paul



 