[antlr-interest] error handling v3 style
Terence Parr
parrt at cs.usfca.edu
Sun Dec 11 10:32:42 PST 2005
For v3, I'm going to allow the usual exception specification at the end
of rules:
a : A B
  | C D
  ;
  exception [label]
  catch [exceptionType exceptionVariable]
    { action }
  catch ...
  catch ...
For control freaks, the templates for code gen can be altered
trivially (and from within the grammar file). Now, wouldn't it be
interesting if we had "error productions", sort of like what yacc
tries to fake? The idea is to provide error alts that match common
ungrammatical sentences:
a : A B
  | C D
  / B A {error("don't you mean A B?"); recover();}
  / A {error("don't you want a B with that?"); recover();}
  ;
(I've arbitrarily used / to mark an error alt, but we probably want
something better and more obvious.) This means that if the first two
alts both fail, the parser rewinds and tries to match one of the last
two (with full backtracking turned on, as the error productions will
often be highly ambiguous).
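To make the rewind-and-retry idea concrete, here is a rough sketch in Python rather than generated parser code. It assumes tokens are plain strings in a list; the names match_seq, rule_a, and RecognitionError are made up for illustration and are not ANTLR API. The recover() step is reduced to simply consuming the erroneous tokens.

```python
class RecognitionError(Exception):
    """Raised when the input does not match what a rule expects."""


def match_seq(tokens, pos, expected):
    """Return the position just past `expected` if it matches at `pos`, else raise."""
    end = pos + len(expected)
    if tokens[pos:end] != expected:
        raise RecognitionError(f"expected {expected} at position {pos}")
    return end


def rule_a(tokens, pos=0):
    """Hypothetical recognizer for:  a : A B | C D / B A {...} / A {...} ;

    Returns (new_position, list_of_error_messages).
    """
    normal_alts = [["A", "B"], ["C", "D"]]
    error_alts = [
        (["B", "A"], "don't you mean A B?"),
        (["A"], "don't you want a B with that?"),
    ]
    # First try the real alternatives; a failure leaves `pos` untouched,
    # which plays the role of the rewind.
    for alt in normal_alts:
        try:
            return match_seq(tokens, pos, alt), []
        except RecognitionError:
            pass
    # Only when every real alternative has failed do we try the error alts, in order.
    for alt, message in error_alts:
        try:
            end = match_seq(tokens, pos, alt)
        except RecognitionError:
            continue
        # "Recovery" here is just consuming the erroneous tokens.
        return end, [message]
    raise RecognitionError(f"no alternative of rule a matches at position {pos}")
```

For example, rule_a(["B", "A"]) consumes both tokens and reports "don't you mean A B?", while input that matches none of the four alts still raises an ordinary recognition error.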
Now, that only matches what the erroneous productions look like and
you have to manually do the recovery step. Should we allow you to
specify the recovery language? This would be an interesting feature
that lets you recover with a grammar fragment rather than an action. For
example, you might want to skip until you see the outermost '}' of a
method. You could do this with
method
    : type ID ...
    ;
    exception
    catch[RecognitionException e]
        ( {level>0}? ('}' {level--;} | .) )*
So instead of an action, you provide a grammar fragment (here a tough
one with context-sensitive matching).
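As a rough illustration of what that recovery fragment would do at runtime, here is a hand-written Python equivalent. It assumes tokens are plain strings and that level holds the current brace nesting depth when the error is caught. The function name skip_to_method_end is hypothetical, and the branch that increments on '{' is an assumption that the fragment above does not show.

```python
def skip_to_method_end(tokens, pos, level):
    """Consume tokens while level > 0, mirroring ( {level>0}? ('}' {level--;} | .) )*.

    Returns (new_position, remaining_level). The new position is just past the
    '}' that closes the outermost block, or len(tokens) if input runs out first.
    """
    while level > 0 and pos < len(tokens):
        tok = tokens[pos]
        if tok == "}":
            level -= 1      # the {level--;} action from the fragment
        elif tok == "{":
            level += 1      # assumption: nested blocks opened during recovery are tracked
        pos += 1            # the '.' wildcard: any other token is simply skipped
    return pos, level
```

Starting inside a method body with level 1, the tokens x { y } } int f are skipped up to and including the second '}', leaving the parser positioned at int.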
Do we need a combination of matching error sequences and then
sophisticated error recovery strategies?
Is that interesting to any of you folks out there building systems?
Does anybody use the paraphrase feature from v2?
ID
options {
    paraphrase = "an identifier";
}
    : ('a'..'z'|'A'..'Z'|'_')
      ('a'..'z'|'A'..'Z'|'_'|'0'..'9')*
    ;
It says "an identifier" instead of ID in error messages.
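The effect amounts to a lookup from token name to a friendlier phrase when the message is built. A minimal Python sketch of that idea follows; the table, the function names, and the message wording are all invented for illustration and are not what ANTLR actually prints.

```python
# Hypothetical table built from the paraphrase options in the grammar.
PARAPHRASES = {"ID": "an identifier"}


def describe(token_type):
    """Return the paraphrase for a token type, falling back to its name."""
    return PARAPHRASES.get(token_type, token_type)


def mismatch_message(expected, found_text):
    """Build an error message; the wording is an assumption, not ANTLR's."""
    return f"expecting {describe(expected)}, found '{found_text}'"
```

With this table, a mismatch on ID reads "expecting an identifier, found '42'", while a token without a paraphrase, say SEMI, is still reported by name.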
Ter