[antlr-interest] How to force error recovery?

Jim Idle jimi at temporal-wave.com
Tue Oct 5 09:53:13 PDT 2010


In that case, all you need to do is add a check for ID in the recovery
method and then check that its text is one of the accepted soft keywords.
If it is not, consume the token; if it is, you have reached a valid
recovery point. It just means that the recovery method will be specific to
that point in that rule.

Jim
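
As a rough illustration of the check described above (this is not code
from the thread): the sketch below skips tokens until it reaches EOF or an
ID whose text is an accepted soft keyword. The token model (Tok, Type), the
class name SoftKeywordResync and the keyword set {"rule", "query"} are
hypothetical stand-ins. In a generated ANTLR 3 parser the same loop would
presumably read from the token stream with input.LA(1),
input.LT(1).getText() and input.consume() instead of indexing a list.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SoftKeywordResync {
    // Hypothetical token model standing in for ANTLR's Token; not the ANTLR API.
    enum Type { ID, OTHER, EOF }

    static final class Tok {
        final Type type;
        final String text;

        Tok(Type type, String text) {
            this.type = type;
            this.text = text;
        }
    }

    // Assumed set of soft keywords that can start a statement.
    static final Set<String> STATEMENT_KEYWORDS =
            new HashSet<>(Arrays.asList("rule", "query"));

    // Returns the index of the first valid recovery point at or after pos:
    // either EOF (or end of list) or an ID whose text is a soft keyword.
    // Every other token is skipped, which models "consume it".
    static int resync(List<Tok> tokens, int pos) {
        while (pos < tokens.size()) {
            Tok t = tokens.get(pos);
            if (t.type == Type.EOF) {
                break; // nothing left to recover to
            }
            if (t.type == Type.ID && STATEMENT_KEYWORDS.contains(t.text)) {
                break; // soft keyword: a statement can start here
            }
            pos++; // not a recovery point: skip the token
        }
        return pos;
    }
}
```

Being an ID is not enough to stop the loop; the text has to match as well,
which is why a plain follow-set check falls short with soft keywords.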

> -----Original Message-----
> From: ed.tirelli at gmail.com [mailto:ed.tirelli at gmail.com] On Behalf Of Edson Tirelli
> Sent: Tuesday, October 05, 2010 9:50 AM
> To: Jim Idle
> Cc: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] How to force error recovery?
> 
>    Jim,
> 
>    The actual situation is that we use "soft keywords" in our grammar,
> i.e., our keywords are ID tokens from the lexer and we check ID.text to
> make sure the ID is a keyword. So, checking if ID is part of the follow
> set is not enough to guarantee the statement rule will succeed.
> 
> statement
>     : rule
>     | query
>     | ...
>     ;
> 
> rule : rule_key ... ;
> 
> rule_key
> 	:
> 	{helper.validateIdentifierKey(DroolsSoftKeywords.RULE)}?=>  id=ID
> 		->	VK_RULE[$id]
> 	;
> 
>     The predicate above just checks input.LT(1) text to make sure it is
> the keyword.
> 
>     So, what I need is a way to call the statement rule again on an
> exception, i.e. continuing in the statement* loop. My grammar is here:
> 
> http://anonsvn.jboss.org/repos/labs/labs/jbossrules/branches/etirelli/drools-compiler/src/main/resources/org/drools/lang/DRL.g
> 
>     Thanks,
>        Edson
> 
> 
> 
> 2010/10/5 Jim Idle <jimi at temporal-wave.com>:
> > If you are using too many syntactic predicates, then you can end up
> > with a case that you cannot recover from without actually trying to
> > parse (backtrack mode). However, this usually means that your grammar
> > needs more left factoring than it has at the moment.
> >
> > However, that said, the followset should only include tokens that can
> > start one of the alts in your statement rule. Hence, having found such
> > a token, your statement rule should handle it and if not, it should
> > throw the exception and allow you to try again. You may just need to
> > apply the technique within subrules. There are also cases where the
> > input is so out of whack that there is nothing that can be recovered.
> >
> > So, without seeing your grammar, I can’t really guide you to be
> > honest, but now you know how to do this in general, you should find
> > that you can work through the specific cases and either re-jig the
> > grammar, or write very specific recovery routines for very specific
> > situations. First rule of thumb is that if you have predicates with
> > more than one or two tokens, then your grammar is very likely in need
> > of some work.
> >
> > Jim
> >
> >> -----Original Message-----
> >> From: ed.tirelli at gmail.com [mailto:ed.tirelli at gmail.com] On Behalf Of Edson Tirelli
> >> Sent: Tuesday, October 05, 2010 9:07 AM
> >> To: Jim Idle
> >> Cc: antlr-interest at antlr.org
> >> Subject: Re: [antlr-interest] How to force error recovery?
> >>
> >>    Hi Jim,
> >>
> >>    Yes, I found the wiki after sending the e-mail yesterday. Thanks
> >> for taking the time to write that, as it was really helpful.
> >>
> >>    Now, continuing on the subject, I need to go a step further for my
> >> use case. Just so you understand, in my case, due to syntactic
> >> predicates, even if the next token is in the follow set, the
> >> "statement" rule can still fail. So, the question is: how to stay in
> >> the loop, skipping/deleting tokens, until it either succeeds in
> >> parsing the rest of the input or EOF is found?
> >>
> >> compilationUnit
> >>    : resync (statement resync)* EOF
> >>    ;
> >>
> >>    Thanks,
> >>      Edson
> >>
> >>
> >>
> >> 2010/10/5 Jim Idle <jimi at temporal-wave.com>:
> >> > Please read the article in the wiki on error recovery methods. You
> >> > can see there how to keep a parse loop going instead of it breaking
> >> > out. You can also see a real world example if you download the
> >> > source code for the JavaFX compiler, as I wrote the error recovery
> >> > article after writing that parser.
> >> >
> >> >
> >> > http://www.antlr.org/wiki/display/ANTLR3/Custom+Syntax+Error+Recovery
> >> >
> >> > Jim
> >> >
> >> >> -----Original Message-----
> >> >> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Edson Tirelli
> >> >> Sent: Monday, October 04, 2010 3:27 PM
> >> >> To: antlr-interest at antlr.org
> >> >> Subject: [antlr-interest] How to force error recovery?
> >> >>
> >> >>    Hi all,
> >> >>
> >> >>    Look at this simple grammar:
> >> >>
> >> >> grammar testGrammar;
> >> >> options {
> >> >>   output=AST;
> >> >> }
> >> >>
> >> >> compilationUnit
> >> >>   : statement* EOF
> >> >>   ;
> >> >>
> >> >> statement
> >> >>   :   A^
> >> >>   |   B^ C
> >> >>   ;
> >> >>
> >> >> A   :   'a';
> >> >>
> >> >> B   : 'b';
> >> >>
> >> >> C   :   'c';
> >> >>
> >> >> WS  :   ( ' '
> >> >>         | '\t'
> >> >>         | '\r'
> >> >>         | '\n'
> >> >>         ) {$channel=HIDDEN;}
> >> >>     ;
> >> >>
> >> >>
> >> >>     Using the above grammar, it will successfully parse an input like:
> >> >>
> >> >> a b c a
> >> >>
> >> >>     Now, if the input is:
> >> >>
> >> >> a c a
> >> >>
> >> >>     The generated parser will parse "a", and will fail at "c", as
> >> >> it is not a valid statement. Reading the error recovery chapter on
> >> >> the ANTLR book, I would imagine ANTLR would delete/skip the "c"
> >> >> token and try to recover, successfully parsing the second "a", as
> >> >> that is a valid statement again. But it is not working like this.
> >> >> It is aborting the parsing with an error at "c".
> >> >>
> >> >>     Question: how do I force it to recover from the error and
> >> >> continue parsing?
> >> >>
> >> >>     The actual scenario is that the parser I am working on is used
> >> >> by an IDE environment (eclipse), so we need it to continue parsing
> >> >> and presenting the users with all the errors found in the file, not
> >> >> just the first one. The error recovery seems to work on some rules,
> >> >> but not on the top rule (compilationUnit).
> >> >>
> >> >>     Thanks,
> >> >>        Edson
> >> >>
> >> >> --
> >> >>   Edson Tirelli
> >> >>   JBoss Drools Core Development
> >> >>   JBoss by Red Hat @ www.jboss.com
> >> >>
> >> >> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> >> >> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
> >> >
> >> >
> >> >
> >>
> >>
> >>
> >> --
> >>   Edson Tirelli
> >>   JBoss Drools Core Development
> >>   JBoss by Red Hat @ www.jboss.com
> >
> >
> >
> 
> 
> 
> --
>   Edson Tirelli
>   JBoss Drools Core Development
>   JBoss by Red Hat @ www.jboss.com


