[antlr-interest] Re: More Error Handling and Recovery

antlrlist antlrlist at yahoo.com
Wed Apr 16 11:42:13 PDT 2003


I think you have a few misconceptions.

I'll begin by assuming that you have not turned off the
defaultErrorHandler option.

The error handler that you attach to "startRule" changes nothing,
because it is essentially the same handler that ANTLR generates by
default (that is, when defaultErrorHandler=true, which is the default).

By the way, ANTLR generates that handler for EVERY rule in your
grammar. That is, "line" has an error handler, "date" has an error
handler, and so on.

The default error handler is always the same:

void  myRule()
{
  try {
   ... // <= Normal recognition
  } catch (RecognitionException re) { //
    reportError(re);                  //  <= Default error handler
    consume();                        //
    consumeUntil(???);                //
  }
}

Now, what is "???"? It is the FOLLOW set of myRule, encoded as a
BitSet (antlr.collections.impl.BitSet) object.
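To make the consume()/consumeUntil() pair concrete, here is a toy model in plain Java. It is only a sketch of the idea, not ANTLR's runtime code: java.util.BitSet stands in for ANTLR's own BitSet class, the token stream is a bare int array, and the token type numbers (DATE, WS, JUNK) are made up for the example.

```java
import java.util.BitSet;

public class ConsumeUntilSketch {
    // Hypothetical token types, chosen only for this illustration.
    static final int EOF = 1, DATE = 4, WS = 5, JUNK = 6;

    final int[] tokens; // the whole token stream, as token types
    int pos = 0;        // index of the current lookahead token

    ConsumeUntilSketch(int[] tokens) {
        this.tokens = tokens;
    }

    // Type of the current lookahead token; EOF once the input is exhausted.
    int LA1() {
        return pos < tokens.length ? tokens[pos] : EOF;
    }

    // Discard the current token.
    void consume() {
        if (pos < tokens.length) {
            pos++;
        }
    }

    // Discard tokens until the lookahead is a member of the set or is EOF.
    void consumeUntil(BitSet set) {
        while (LA1() != EOF && !set.get(LA1())) {
            consume();
        }
    }

    public static void main(String[] args) {
        BitSet follow = new BitSet();
        follow.set(DATE); // pretend FOLLOW(myRule) = { DATE }

        ConsumeUntilSketch p =
            new ConsumeUntilSketch(new int[] {JUNK, JUNK, WS, DATE, WS, EOF});
        p.consume();            // drop the offending token
        p.consumeUntil(follow); // then skip ahead to something in the set
        System.out.println("resumed at index " + p.pos
                           + ", token type " + p.LA1());
        // prints: resumed at index 3, token type 4
    }
}
```

The set decides where parsing resumes: the handler throws tokens away until it sees one that may legally come after the rule.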

If you want to use BitSets in your actions or in your exception
handlers, you can use $FIRST and $FOLLOW. $FIRST(rulename) maps to
"the FIRST set of rulename". $FOLLOW works the same way. If you use
$FIRST alone (no arguments) you mean "the FIRST set of the rule that
the exception handler belongs to".

So for your startRule you could write something like this:

startRule
     :  ( line )+ EOF
     ;
     exception // for rule
     catch [RecognitionException ex] {
     reportError(ex); // Console.Error.WriteLine("exception: "+ex);
     consume();
     consumeUntil($FOLLOW); // <= HERE!
     }

You can use other sets instead of $FOLLOW, like $FOLLOW(startRule).
You should never use _tokenSet_XX for that, because those names are
generated by ANTLR and can change whenever the grammar changes. If you
cannot use $FOLLOW or $FIRST directly, you can:
a) Create a bitset {BitSet b=new BitSet();},
   add the tokens you need {b.add(IDENT); b.add(WS); ...}
   and then consume {consumeUntil(b);}
or b) Create a bitset with boolean operations like "and" and "or"
   { BitSet b = $FIRST.or($FOLLOW); }
   and then consume {consumeUntil(b);}
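Options a) and b) can be tried out in plain Java. This is a sketch with java.util.BitSet standing in for ANTLR's BitSet, so the method names differ (set() here versus add() above), and the token type numbers are invented. One difference worth noting: java.util.BitSet.or() modifies the receiver in place, whereas the snippet in b) assigns the result of or() to a new variable, so the sketch clones before combining.

```java
import java.util.BitSet;

public class RecoverySetSketch {
    // Hypothetical token types, chosen only for this illustration.
    static final int IDENT = 4, WS = 5, DATE = 6;

    // Option a): build a set by adding token types one by one.
    static BitSet of(int... types) {
        BitSet b = new BitSet();
        for (int t : types) {
            b.set(t);
        }
        return b;
    }

    // Option b): combine two sets without modifying either of them.
    static BitSet union(BitSet a, BitSet b) {
        BitSet result = (BitSet) a.clone();
        result.or(b); // in-place on the clone only
        return result;
    }

    public static void main(String[] args) {
        BitSet first = of(DATE);        // pretend FIRST set
        BitSet follow = of(IDENT, WS);  // pretend FOLLOW set
        BitSet both = union(first, follow);
        System.out.println(both);  // {4, 5, 6}
        System.out.println(first); // {6}
    }
}
```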

The point is that startRule's error handler helps you very little.
It will consume tokens while waiting for one in FOLLOW(startRule),
and recognition will end in the meantime, for two reasons:
FOLLOW(startRule) is an empty set, and recognition is already over
by the time you reach startRule's error handler.
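The second reason can be seen with a plain-Java analogy. This is not ANTLR-generated code: line() is a stand-in for a rule method, a leading '?' is an invented marker for a malformed line, and IllegalStateException plays the part of RecognitionException. The only thing the sketch shows is how the position of the try/catch decides whether the loop can go on.

```java
import java.util.ArrayList;
import java.util.List;

public class HandlerPlacementSketch {
    // Stand-in for the "line" rule: a line starting with '?' is malformed.
    static void line(String s) {
        if (s.startsWith("?")) {
            throw new IllegalStateException("bad line: " + s);
        }
    }

    // Handler around the whole loop, like a handler attached to startRule.
    static List<String> handlerOnStartRule(String[] input) {
        List<String> recognized = new ArrayList<>();
        try {
            for (String s : input) {
                line(s);
                recognized.add(s);
            }
        } catch (IllegalStateException ex) {
            // By the time control arrives here the loop has been abandoned.
        }
        return recognized;
    }

    // Handler inside the loop, like a handler attached to line.
    static List<String> handlerOnLine(String[] input) {
        List<String> recognized = new ArrayList<>();
        for (String s : input) {
            try {
                line(s);
                recognized.add(s);
            } catch (IllegalStateException ex) {
                // Only this line is lost; the loop carries on.
            }
        }
        return recognized;
    }

    public static void main(String[] args) {
        String[] input = {"good1", "?bad", "good2"};
        System.out.println(handlerOnStartRule(input)); // [good1]
        System.out.println(handlerOnLine(input));      // [good1, good2]
    }
}
```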

Here is my advice:

1- Familiarize yourself with FOLLOW and FIRST (I mean the abstract
concepts, not ANTLR's commands!)
2- Familiarize yourself with how ANTLR implements error recovery. The
file err.htm in the docs is a good start.
3- Have a look at the code ANTLR generates for your analyzer, until
you understand why the error handler you've used in startRule is
(sorry) useless.

Normally, error recovery is a long task that involves a lot of code
tweaking and a good understanding of the generated code; that is, it
is not something you can do with a single exception handler in a
single rule.

On another note, why are you recognizing EOF and WS? You're
using a Lexer and a Parser, right?

I hope this helps. Cheers!

Enrique





--- In antlr-interest at yahoogroups.com, "madison_stjames"
<madison_stjames at y...> wrote:
> Ok, I think I've almost got it:
> 
> Here's my start rule:
>  startRule
>     :  ( line )+ EOF
>     ;
>     exception // for rule
>     catch [RecognitionException ex] {
>     Console.Error.WriteLine("exception: "+ex);
>     consume();
>     consumeUntil(tokenSet_0_);
>     }
> 
> What I want to do is resume the line production, upon encountering 
> an unrecognized token. The error is written out, and parsing 
> continues with the next line in the file.
> 
> The line production composed of sub-productions as follows:
> 
> line
>     : ( date WS time WS cip WS csusername WS sip WS sport WS 
> csmethod WS csuristem WS csuriquery WS scstatus WS csuseragent (WS)?
>         { 
>            Console.Out.WriteLine( SBLine.ToString() );
>            SBLine = new StringBuilder();
>         }
>        )
>     ; 
> 
> I want to continue to the next rule that the parser can recognize. I 
> thought line should work, for example: consumeUntil(line) but that 
> causes an error.
> 
> I looked at the parser file from a previous version, and noticed 
> that error handling specified token sets. How do these map to the 
> rules? And how do I determine which ones to use?
> 
> The exception handling routine above works, but I'm not sure exactly 
> what I'm referencing with tokenSet_0_;
> 
> Thanks Again!


 
