[antlr-interest] help with error recovery

Wed Feb 17 11:21:46 PST 2010

Thank you Jim - I hadn't thought about the problem in that way.

- Joe

On Wed, Feb 17, 2010 at 11:44 AM "Jim Idle" <jimi at temporal-wave.com> wrote:
> Do this:
> 
> 1) Move those to real lexer tokens (though I understand this may just be an example)
> 2) Use predicates for real things
> 3) Eat and discard the rest
> 
> So:
> 
> foo
>   : ( (bar)=>bar | . )+ -> bar+
>   ;
> 
> Here I show the whole rule bar as the predicate. That can be expensive if the rule is complicated, so in that case write a separate rule containing only the minimum token sequence needed to predict bar correctly, and use that as the predicate instead of the complete rule.
> 
> If you find that you must do this via error recovery and resync the input manually, then you want:
> 
> http://www.antlr.org/wiki/display/ANTLR3/Custom+Syntax+Error+Recovery
> 
> Jim
> 
> 
> 
> > -----Original Message-----
> > From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Joe stelmach
> > Sent: Wednesday, February 17, 2010 7:59 AM
> > To: antlr-interest at antlr.org
> > Subject: [antlr-interest] help with error recovery
> > 
> > I'm looking for some help implementing a custom error recovery
> > strategy.
> > 
> > Consider the following grammar, which accepts strings of the form
> > "ab--ab--cd--"... and generates flat ASTs of the form: GROUP["ab"]
> > GROUP["ab"] GROUP["cd"]...
> > 
> > grammar Test;
> > 
> > options {
> >   output=AST;
> > }
> > 
> > tokens {
> >   GROUP;
> > }
> > 
> > foo
> >   : (bar '--')+ -> bar+
> >   ;
> > 
> > bar
> >   : (('a' 'b') | ('c' 'd')) -> GROUP[$bar.text]
> >   ;
> > 
> > Now suppose we feed the parser the input string "ab--ac--cd--".  I
> > would like the resulting AST to look like: GROUP["ab"] GROUP["cd"]
> > corresponding to the first "ab" and the last "cd" of the input string.
> >  In other words, when the parser starts to match a bar rule but fails
> > (as it will when it encounters the first 'c' token in our example
> > input), I'd like to scan past all tokens until the next '--' token,
> > and then tell the parser to back up to the state it was in just after
> > encountering the first 'b' token.
> > 
> > I'm able to override what I think are the appropriate methods of
> > BaseRecognizer, and I understand how to scan past and consume the
> > tokens I don't care about, but I'm unsure how to direct the parser
> > back to the previous state (or whether it's even possible).
> > 
> > Any help would be appreciated.
> > 
> > - Joe
> > 
> > List: http://www.antlr.org/mailman/listinfo/antlr-interest
> > Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
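
The "match bar when the predicate succeeds, otherwise eat and discard one token" loop suggested in the thread can be illustrated outside ANTLR. The following is only a rough Python sketch of that behaviour, not ANTLR-generated code: the tokenizer and the names tokenize, predicts_bar and parse_foo are hypothetical, and it assumes 'a', 'b', 'c', 'd' and '--' are the only meaningful tokens.

```python
# Hypothetical simulation of:  foo : ( (bar)=>bar | . )+ -> bar+ ;
# This is not how ANTLR implements predicates; it only mimics the effect.

BAR_ALTERNATIVES = (("a", "b"), ("c", "d"))


def tokenize(text):
    """Split the input into single characters, treating '--' as one token."""
    tokens = []
    i = 0
    while i < len(text):
        if text.startswith("--", i):
            tokens.append("--")
            i += 2
        else:
            tokens.append(text[i])
            i += 1
    return tokens


def predicts_bar(tokens, pos):
    """Stand-in for the syntactic predicate: look ahead without consuming."""
    return any(
        tuple(tokens[pos:pos + len(alt)]) == alt for alt in BAR_ALTERNATIVES
    )


def parse_foo(text):
    """Collect the text of every bar found; discard all other tokens."""
    tokens = tokenize(text)
    groups = []
    pos = 0
    while pos < len(tokens):
        if predicts_bar(tokens, pos):
            # The predicate succeeded, so match bar (always two tokens here).
            groups.append("".join(tokens[pos:pos + 2]))
            pos += 2
        else:
            # The '.' alternative: eat one token and throw it away.
            pos += 1
    return groups


print(parse_foo("ab--ac--cd--"))
```

For the input "ab--ac--cd--" this yields the two groups "ab" and "cd", which is the result asked for in the original question. Note that the loop resynchronizes at any token rather than only after a '--', so it is not identical to scanning ahead to the next '--'.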