[antlr-interest] How to force error recovery?

Edson Tirelli tirelli at post.com
Tue Oct 5 09:50:18 PDT 2010


   Jim,

   The actual situation is that we use "soft keywords" in our grammar,
i.e., our keywords are ID tokens from the lexer and we check ID.text
to make sure the ID is a keyword. So, checking if ID is part of the
follow set is not enough to guarantee the statement rule will succeed.

statement
    : rule
    | query
    | ...
    ;

rule : rule_key ... ;

rule_key
	:	{helper.validateIdentifierKey(DroolsSoftKeywords.RULE)}?=>  id=ID
		->	VK_RULE[$id]
	;

    The predicate above just checks the text of input.LT(1) to make
sure it is the keyword.
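
    To make the idea concrete, here is a minimal sketch in Java of what
such a soft-keyword check amounts to. It is not the actual Drools helper:
the class SoftKeywordHelper, its token list, and lookaheadText are
hypothetical stand-ins for the parser's token stream, and only the method
name validateIdentifierKey is borrowed from the grammar above.

```java
import java.util.List;

// Hypothetical stand-in for a parser helper; NOT the actual Drools class.
class SoftKeywordHelper {
    private final List<String> tokens; // token texts, all lexed as plain ID
    private int pos = 0;               // index of the next unconsumed token

    SoftKeywordHelper(List<String> tokens) {
        this.tokens = tokens;
    }

    // Text of the k-th lookahead token (k = 1 is the next token),
    // or null when the input is exhausted.
    String lookaheadText(int k) {
        int i = pos + k - 1;
        return i < tokens.size() ? tokens.get(i) : null;
    }

    // True only when the next token's text equals the soft keyword.
    boolean validateIdentifierKey(String keyword) {
        String text = lookaheadText(1);
        return text != null && text.equals(keyword);
    }

    // Advance past the current token.
    void consume() {
        pos++;
    }
}
```

    Because the token type is always ID, only a check like this can tell
a keyword from an ordinary identifier, which is why the follow set alone
cannot predict whether the statement rule will match.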

    So, what I need is a way to call the statement rule again after an
exception, i.e., to keep going in the statement* loop. My grammar is here:

http://anonsvn.jboss.org/repos/labs/labs/jbossrules/branches/etirelli/drools-compiler/src/main/resources/org/drools/lang/DRL.g
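
    To illustrate the behavior being asked for, here is a small
hand-written sketch in Java. It is not ANTLR-generated code and does not
use the ANTLR runtime; it hard-codes the toy grammar from the first
message (statement : A | B C) and simply skips one token whenever
statement fails, so the loop always reaches EOF. The class
RecoveringParser and its fields are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a hand-rolled parser for  statement : A | B C ;
class RecoveringParser {
    private final List<String> tokens;
    private int pos = 0;
    final List<String> statements = new ArrayList<>(); // statements parsed
    final List<String> errors = new ArrayList<>();     // errors reported

    RecoveringParser(List<String> tokens) {
        this.tokens = tokens;
    }

    // Returns true and consumes tokens on success;
    // leaves pos untouched on failure.
    private boolean statement() {
        if (pos < tokens.size() && tokens.get(pos).equals("a")) {
            statements.add("a");
            pos += 1;
            return true;
        }
        if (pos + 1 < tokens.size()
                && tokens.get(pos).equals("b")
                && tokens.get(pos + 1).equals("c")) {
            statements.add("b c");
            pos += 2;
            return true;
        }
        return false;
    }

    // compilationUnit : statement* EOF ;  with skip-one-token recovery.
    void compilationUnit() {
        while (pos < tokens.size()) {
            if (!statement()) {
                errors.add("unexpected '" + tokens.get(pos) + "' at token " + pos);
                pos++; // always consume a token so the loop terminates
            }
        }
    }
}
```

    With the input "a c a" this records one error for the "c" and still
parses both "a" statements, which is the result an IDE would want.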

    Thanks,
       Edson



2010/10/5 Jim Idle <jimi at temporal-wave.com>:
> If you are using too many syntactic predicates, then you can end up with a
> case that you cannot recover from without actually trying to parse
> (backtrack mode). However, this usually means that your grammar needs more
> left factoring than it has at the moment.
>
> However, that said, the follow set should only include tokens that can start
> one of the alts in your statement rule. Hence, having found such a token,
> your statement rule should handle it and if not, it should throw the
> exception and allow you to try again. You may just need to apply the
> technique within subrules. There are also cases where the input is so out of
> whack that there is nothing that can be recovered.
>
> So, without seeing your grammar, I can’t really guide you to be honest, but
> now you know how to do this in general, you should find that you can work
> through the specific cases and either re-jig the grammar, or write very
> specific recovery routines for very specific situations. First rule of thumb
> is that if you have predicates with more than one or two tokens, then your
> grammar is very likely in need of some work.
>
> Jim
>
>> -----Original Message-----
>> From: ed.tirelli at gmail.com [mailto:ed.tirelli at gmail.com] On Behalf Of Edson Tirelli
>> Sent: Tuesday, October 05, 2010 9:07 AM
>> To: Jim Idle
>> Cc: antlr-interest at antlr.org
>> Subject: Re: [antlr-interest] How to force error recovery?
>>
>>    Hi Jim,
>>
>>    Yes, I found the wiki after sending the e-mail yesterday. Thanks for
>> taking the time to write that, as it was really helpful.
>>
>>    Now, continuing on the subject, I need to go a step further for my use
>> case. Just so you understand, in my case, due to syntactic predicates, even
>> if the next token is in the follow set, the "statement" rule can still fail.
>> So, the question is: how do I stay in the loop, skipping/deleting tokens,
>> until it either succeeds in parsing the rest of the input or EOF is found?
>>
>> compilationUnit
>>    : resync (statement resync)* EOF
>>    ;
>>
>>    Thanks,
>>      Edson
>>
>>
>>
>> 2010/10/5 Jim Idle <jimi at temporal-wave.com>:
>> > Please read the article in the wiki on error recovery methods. You can
>> > see there how to keep a parse loop going instead of it breaking out.
>> > You can also see a real world example if you download the source code
>> > for the JavaFX compiler, as I wrote the error recovery article after
>> > writing that parser.
>> >
>> > http://www.antlr.org/wiki/display/ANTLR3/Custom+Syntax+Error+Recovery
>> >
>> > Jim
>> >
>> >> -----Original Message-----
>> >> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
>> >> bounces at antlr.org] On Behalf Of Edson Tirelli
>> >> Sent: Monday, October 04, 2010 3:27 PM
>> >> To: antlr-interest at antlr.org
>> >> Subject: [antlr-interest] How to force error recovery?
>> >>
>> >>    Hi all,
>> >>
>> >>    Look at this simple grammar:
>> >>
>> >> grammar testGrammar;
>> >> options {
>> >>   output=AST;
>> >> }
>> >>
>> >> compilationUnit
>> >>   : statement* EOF
>> >>   ;
>> >>
>> >> statement
>> >>   :   A^
>> >>   |   B^ C
>> >>   ;
>> >>
>> >> A   :   'a';
>> >>
>> >> B   : 'b';
>> >>
>> >> C   :   'c';
>> >>
>> >> WS  :   ( ' '
>> >>         | '\t'
>> >>         | '\r'
>> >>         | '\n'
>> >>         ) {$channel=HIDDEN;}
>> >>     ;
>> >>
>> >>
>> >>     Using the above grammar, it will successfully parse an input like:
>> >>
>> >> a b c a
>> >>
>> >>     Now, if the input is:
>> >>
>> >> a c a
>> >>
>> >>     The generated parser will parse "a", and will fail at "c", as it
>> >> is not a valid statement. Reading the error recovery chapter in the
>> >> ANTLR book, I would imagine ANTLR would delete/skip the "c" token and
>> >> try to recover, successfully parsing the second "a", as that is a valid
>> >> statement again. But it is not working like this. It is aborting the
>> >> parsing with an error at "c".
>> >>
>> >>     Question: how do I force it to recover from the error and
>> >> continue parsing?
>> >>
>> >>     The actual scenario is that the parser I am working on is used by
>> >> an IDE environment (Eclipse), so we need it to continue parsing and
>> >> present the users with all the errors found in the file, not just the
>> >> first one. The error recovery seems to work on some rules, but not on
>> >> the top rule (compilationUnit).
>> >>
>> >>     Thanks,
>> >>        Edson
>> >>
>> >> --
>> >>   Edson Tirelli
>> >>   JBoss Drools Core Development
>> >>   JBoss by Red Hat @ www.jboss.com
>> >>
>> >> List: http://www.antlr.org/mailman/listinfo/antlr-interest
>> >> Unsubscribe:
>> >> http://www.antlr.org/mailman/options/antlr-interest/your-
>> >> email-address
>> >
>> >
>> >
>>
>>
>>
>
>
>



-- 
  Edson Tirelli
  JBoss Drools Core Development
  JBoss by Red Hat @ www.jboss.com

