[antlr-interest] Re: patching a tree (recoverability)

lgcraymer lgc at mail1.jpl.nasa.gov
Fri Nov 19 17:17:01 PST 2004



--- In antlr-interest at yahoogroups.com, "Paul J. Lucas"
<pauljlucas at m...> wrote:
> 	Suppose I want to parse and compile as much as possible.  An
> 	example is having:
> 
> 		funcDecl
> 		    : DECLARE! FUNCTION^ IDENT '('! paramList ')'! funcBody ';'!
> 		    	{
> 			    ##.setType( FUNC_DECL );
> 			}
> 		    ;
> 
> 	I want to recover if something doesn't parse in funcBody and
> 	produce a tree not having a body.  The reason is that the
> 	compiler (tree-walker) could compile the program from the AST as
> 	much as possible.  In this case, it could at least enter the
> 	function signature for the failed-to-compile function into the
> 	symbol table so that some later function can call the failed
> 	function and not generate a "function not declared" error.

Recovery is your first tough problem.  ANTLR's error handling lets
you intercept the error within your grammar.  The next problem is to
resynch the token stream.  If you are looking at a fragment A B C D
and an error is thrown because C is unexpected, you still have D to
deal with if it was part of the syntactic construct being matched
(that is, you need to throw away D so that you can recognize E F G).
So you will usually need to consume a few tokens before getting back
in synch.  Resynching is very syntax-dependent; if you look at
antlr.g, you can see places where a few abortive attempts were made
to achieve grammar error recovery.
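
To make the A B C D example concrete, here is a minimal sketch of the
"throw tokens away until you reach something you can recognize" step.
It is plain Java with no ANTLR dependency; the class name, the
resync() helper, and the use of strings as token types are all
assumptions made for illustration, not ANTLR API.  Choosing the
synchronization set is the syntax-dependent part and is simply handed
in here.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ResyncSketch {
    /**
     * Skips tokens starting at index 'from' until one whose type is in
     * 'syncSet' is found. Returns the index of that token, or
     * tokens.size() if the stream runs out first.
     * (Hypothetical helper: token types are modeled as plain strings.)
     */
    static int resync(List<String> tokens, int from, Set<String> syncSet) {
        int i = from;
        while (i < tokens.size() && !syncSet.contains(tokens.get(i))) {
            i++; // discard this token; it belongs to the broken construct
        }
        return i;
    }

    public static void main(String[] args) {
        // The parser matched A B, then failed on C (index 2).
        List<String> tokens = Arrays.asList("A", "B", "C", "D", "E", "F", "G");
        int errorAt = 2;

        // Assumption: E is a token that can start the next construct.
        Set<String> syncSet = new HashSet<>(Arrays.asList("E"));

        int next = resync(tokens, errorAt, syncSet);
        // C and D were thrown away; parsing can resume at E F G.
        System.out.println(tokens.subList(next, tokens.size())); // [E, F, G]
    }
}
```

If the synchronization set is too small you skip to end of input; if
it is too large you stop early and report a second, spurious error,
which is why this step has to be tuned per grammar.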

> 	But how to communicate the "failed-ness" from the parser to the
> 	tree-parser?  Is there a standard-practice "ANTLR way" to do
> 	this?  If not, I've been thinking along the lines of
> 	introducing an "ERROR" token (yes, like yacc) and "patching" it
> 	into the tree.
>
> 		funcBody
> 		    : '{'! expr '}'!
> 		    ;
> 		    exception
> 		    catch [ TokenStreamRecognitionException e ] {
> 		    	## = #([ERROR,"ERROR"]);
> 		    }
> 
> 	Then in the tree-parser I can do:
> 
> 		functionDecl
> 		    : #( FUNC_DECL IDENT paramList
> 			( funcBody
> 			    {
> 			    	// The normal case
> 			    }
> 			| ERROR
> 			    {
> 			    	// At least enter the signature into the
> 				// symbol table.
> 			    }
> 			)
> 		       )
> 		    ;
> 
> 	Comments?

This could work; it depends very much on the language problem you are
addressing.  In general, once you have managed to resynch the token
stream, you are in unknown territory from an ANTLR perspective.
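
For what it is worth, the tree-walker side of the ERROR-node idea can
be sketched as below.  This is plain Java, not generated ANTLR code:
the Node class stands in for an AST node, the token type constants are
made-up values, and walkFuncDecl() and func() are hypothetical helpers.
It assumes the parser always produces the shape
#(FUNC_DECL IDENT paramList (funcBody | ERROR)).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ErrorNodeSketch {
    // Hypothetical token types; a real grammar gets these from ANTLR.
    static final int FUNC_DECL = 1, IDENT = 2, PARAM_LIST = 3,
                     FUNC_BODY = 4, ERROR = 5;

    /** Minimal stand-in for an AST node (not an ANTLR class). */
    static final class Node {
        final int type;
        final String text;
        final List<Node> children;

        Node(int type, String text, Node... children) {
            this.type = type;
            this.text = text;
            this.children = Arrays.asList(children);
        }
    }

    /**
     * Walks one FUNC_DECL subtree. The signature (name -> parameter
     * count) is always entered into 'symbols'; the name is added to
     * 'compiled' only when the third child is a real body, not ERROR.
     */
    static void walkFuncDecl(Node decl, Map<String, Integer> symbols,
                             List<String> compiled) {
        String name = decl.children.get(0).text;
        int arity = decl.children.get(1).children.size();
        symbols.put(name, arity); // done in both cases

        Node bodyOrError = decl.children.get(2);
        if (bodyOrError.type != ERROR) {
            compiled.add(name);   // the normal case
        }
    }

    /** Builds a FUNC_DECL subtree for the demo below. */
    static Node func(String name, Node body, String... params) {
        Node[] ps = new Node[params.length];
        for (int i = 0; i < params.length; i++) {
            ps[i] = new Node(IDENT, params[i]);
        }
        return new Node(FUNC_DECL, "function",
                new Node(IDENT, name),
                new Node(PARAM_LIST, "params", ps),
                body);
    }

    public static void main(String[] args) {
        Map<String, Integer> symbols = new LinkedHashMap<>();
        List<String> compiled = new ArrayList<>();

        // f parsed cleanly; g's body failed and was patched with ERROR.
        walkFuncDecl(func("f", new Node(FUNC_BODY, "body"), "x"),
                     symbols, compiled);
        walkFuncDecl(func("g", new Node(ERROR, "ERROR"), "a", "b"),
                     symbols, compiled);

        System.out.println(symbols);  // {f=1, g=2}
        System.out.println(compiled); // [f]
    }
}
```

The sketch only works if the parser guarantees that the ERROR node
sits exactly where the body would have been; if resynching leaves
stray children behind, the walker's positional assumptions break.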

--Loring

> 	- Paul





 