[antlr-interest] Re: Custom error recovery

Wed Jul 27 18:14:42 PDT 2005

First off, thank you Bryan and Alexey for your replies -- I'm slowly
working this stuff out now.

 > Do you really want to do this, or do you want to have only certain
 > productions be the "stopping point" for error recovery?
 >
 > For example, do you want to "skip to semicolon", or "recover at
 > <statement>"?  The first is somewhat complex in ANTLR, the second is
 > very straightforward using options in the parser section to disable
 > default error recovery, and in particular rules to enable default
 > error recovery.

I'm not sure I understand the difference between the two alternatives
above. What I would like it to do is something like the following.
Suppose my source contains this:

int a;
print "hello";
int var1, var2, var3;
int x = 1+1;
nondatatype whatever not valid tokens;
int c;
int more, vars;

So when the parser runs into line 5 in the source snippet (the one
starting with "nondatatype") it will generate some sort of syntax error,
and I would like it to just ignore the rest of the line, i.e. skip until
it finds the SEMICOLON token/char. Of course, this applies only within
functions, but the language I parse doesn't have anything but include
directives outside of functions, so this is a very small problem.
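
To make the intended behaviour concrete, here is a minimal sketch in plain
Java. It is not ANTLR code: the class, the method names and the toy
"statement" check (a statement is accepted if it starts with "int" or
"print") are all made up for illustration, and the tokens are just strings.
It only shows the "report the error, then skip past the next semicolon"
idea.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a parser; none of these names come from ANTLR.
class SkipToSemicolonSketch {

    // Returns the index of the first token after the next ";" found at or
    // after pos, or tokens.size() if no ";" remains.
    static int skipPastSemicolon(List<String> tokens, int pos) {
        while (pos < tokens.size() && !tokens.get(pos).equals(";")) {
            pos++;
        }
        return Math.min(pos + 1, tokens.size());
    }

    // Toy statement loop (assumption: a statement is valid if it starts
    // with "int" or "print"). Every statement, valid or not, is consumed
    // up to and including its ";". Returns the first token of each
    // rejected statement.
    static List<String> parse(List<String> tokens) {
        List<String> errors = new ArrayList<>();
        int pos = 0;
        while (pos < tokens.size()) {
            String first = tokens.get(pos);
            if (!first.equals("int") && !first.equals("print")) {
                errors.add(first); // report, then resynchronize below
            }
            pos = skipPastSemicolon(tokens, pos);
        }
        return errors;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList(
            "int", "a", ";",
            "nondatatype", "whatever", "not", "valid", "tokens", ";",
            "int", "c", ";");
        System.out.println(parse(tokens));
    }
}
```

The point is that the statements after the bad one are still seen by the
parser, because recovery resumes right after the semicolon.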

Earlier this morning I looked at the parser source generated by ANTLR 
but to be honest, I don't really understand that much of it. I 
understand all the small parts, but I have difficulties understanding 
the wider perspective -- how all the pieces work together etc.

In particular, I'm not really sure I understand the default syntax error
recovery strategy used by ANTLR. If someone could explain it in plain
English, that would be great.

---

Also, while thinking about this recovery stuff I came up with another
fancy feature that I would like to implement if possible. I'm building 
an Eclipse plugin and in Eclipse syntax errors are typically underlined 
by a red squiggly line. For instance if I type this in Java:

int var blah;

Then Eclipse would recognize that "blah" isn't really expected, and it
would underline the "blah" part with an error saying something like 
"this was an unexpected token".

So, is it possible to have ANTLR provide the information required to
implement this? I tried earlier today to do this (this was a little bit
of a long shot I guess):

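public void reportError(RecognitionException e) {
   // Declared outside the try block so it is still in scope below.
   Token errorToken = null;
   try {
     errorToken = LT(0);
   } catch (TokenStreamException tse) { /* ignored for now */ }
   // Token has no getLength(), so this call is what fails to compile.
   editor.displayError(e.getMessage(), e.getLine(),
                       e.getColumn(), errorToken.getLength());
}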

The problem with this approach was that the Token class apparently does
not have a getLength() method -- thus I didn't get this experiment very
far.

I'm not really sure what token LT(0) returns so I don't know if it even
could have worked had there been a getLength() method. The idea was of
course to have Eclipse place its red underlining starting at the column
(on the specified line) and extending for length-chars ahead.
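
A rough sketch of the range computation I have in mind, in plain Java. The
helper names are made up, and it assumes 1-based line and column numbers,
'\n' line separators, and that the length of the token's text (e.g. from
getText()) matches the number of characters it occupies in the source,
which may not hold for tokens whose text is altered by the lexer.

```java
// Hypothetical helper; these names are not part of ANTLR or Eclipse.
class UnderlineRangeSketch {

    // Converts a 1-based line and column into a 0-based character offset
    // into source, assuming lines are separated by '\n'.
    static int toOffset(String source, int line, int column) {
        int offset = 0;
        for (int current = 1; current < line; current++) {
            int newline = source.indexOf('\n', offset);
            if (newline < 0) {
                throw new IllegalArgumentException("line out of range: " + line);
            }
            offset = newline + 1;
        }
        return offset + column - 1;
    }

    // Returns {startOffset, length} for the region to underline, using
    // the token text's length as the underline length.
    static int[] underlineRange(String source, int line, int column,
                                String tokenText) {
        return new int[] { toOffset(source, line, column), tokenText.length() };
    }

    public static void main(String[] args) {
        String source = "int a;\nint var blah;\n";
        int[] range = underlineRange(source, 2, 9, "blah");
        // Prints the text that would be underlined.
        System.out.println(source.substring(range[0], range[0] + range[1]));
    }
}
```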

Has anyone else done something like this before? i.e. investigating the
length of the token or word or whatever that appeared instead of the
expected token?

Regards,
martin