[antlr-interest] Error recovery during list assembly

Sat Jul 19 18:25:30 PDT 2008

At 09:58 20/07/2008, Brent Yates wrote:
>For the following rule:
>
>list
>     :   BEGIN item* END
>     ;
>
>I would like to be able to recover from tokens which are not 
>valid items and not the END literal in such a way that the item 
>list assembly keeps going.  From inspection of the currently 
>generated code (3.1b2, C target) the parser will check to see if 
>the next token matches one of the expected tokens for item.  If 
>it does, it calls the item rule, if not, it drops out of the item 
>loop and tries to match the END literal.  If the end match fails 
>then a normal single token delete/insert recovery is tried.  The 
>parser is expecting the END literal though, not more items, and I 
>don't see a way to get back into the items loop.

This is just off the cuff; I'm not entirely sure if it'll work or 
if it'll do what you want, but it could be worth trying :)

list
   :  BEGIN
      ( item
      | (~END) => . { /* report an error */ }
      )* END
   ;

In theory, this will produce one "error" for each token that can't 
be interpreted as an item, and will exit the loop only when it 
sees the END.
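
To make the intended behaviour concrete, here is a small hand-written 
Python simulation of that loop. It is only a sketch, not what ANTLR 
generates: the token names (BEGIN, ITEM, END, JUNK) and the parse_list 
helper are made up for illustration, and 'item' is assumed to be a 
single ITEM token.

```python
def parse_list(tokens):
    """Simulate  BEGIN ( item | ~END )* END  over a list of token names.

    Assumption: an item is exactly one ITEM token (hypothetical).
    Returns (items, errors): the indices of tokens accepted as items and
    one message per token that was skipped.
    """
    items, errors = [], []
    if not tokens or tokens[0] != "BEGIN":
        raise SyntaxError("expected BEGIN")
    pos = 1
    # Stay in the loop until END shows up; nothing else can end it.
    while pos < len(tokens) and tokens[pos] != "END":
        if tokens[pos] == "ITEM":
            items.append(pos)  # first alternative: a valid item
        else:
            # second alternative: consume one non-END token, report it
            errors.append(f"unexpected {tokens[pos]} at index {pos}")
        pos += 1
    if pos >= len(tokens):
        raise SyntaxError("expected END")
    return items, errors


if __name__ == "__main__":
    print(parse_list(["BEGIN", "ITEM", "JUNK", "ITEM", "END"]))
```

The stray JUNK token yields one error and the second ITEM is still 
collected, which is the "keep going" behaviour asked for.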

If 'item' is particularly long or contains loops, you might need 
to add a synpred (syntactic predicate) to it as well, so that a 
malformed item makes the parser backtrack and fall through to the 
error alternative:

list
   :  BEGIN
      ( (item) => item
      | (~END) => . { /* report an error */ }
      )* END
   ;
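
As a rough picture of what the extra predicate buys you, here is 
another hand-written Python sketch (again, not ANTLR output). It 
assumes a hypothetical three-token item, ID EQ NUM; match_item and 
parse_list_backtracking are illustrative names. The item is matched 
speculatively, and if it doesn't match in full the position is left 
untouched and a single token is consumed as an error instead.

```python
def match_item(tokens, pos):
    """Speculatively match ID EQ NUM (a hypothetical item) at pos.

    Returns the position just past the item, or None if it does not
    match. Nothing is consumed on failure, which plays the role of the
    (item) => predicate rewinding the input.
    """
    if tokens[pos:pos + 3] == ["ID", "EQ", "NUM"]:
        return pos + 3
    return None


def parse_list_backtracking(tokens):
    """Simulate  BEGIN ( (item) => item | (~END) => . )* END."""
    items, errors = [], []
    if not tokens or tokens[0] != "BEGIN":
        raise SyntaxError("expected BEGIN")
    pos = 1
    while pos < len(tokens) and tokens[pos] != "END":
        end = match_item(tokens, pos)
        if end is not None:
            items.append((pos, end))  # whole item matched
            pos = end
        else:
            # predicate failed: skip one token and report it
            errors.append(f"unexpected {tokens[pos]} at index {pos}")
            pos += 1
    if pos >= len(tokens):
        raise SyntaxError("expected END")
    return items, errors


if __name__ == "__main__":
    toks = ["BEGIN", "ID", "EQ", "NUM", "ID", "EQ", "JUNK",
            "ID", "EQ", "NUM", "END"]
    print(parse_list_backtracking(toks))
```

Here the broken middle item (ID EQ JUNK) costs three single-token 
errors, and the loop resynchronises on the next well-formed item.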

Actually, you can probably even get away with just this, now that 
I think about it:
   list : BEGIN ( item | ~END { /* error */ } )* END;