[antlr-interest] Early exit exception in positive closures

Andreas Meyer andreas.meyer at smartshift.de
Tue Mar 10 04:46:23 PDT 2009


Hi!

I have a question about predicates in positive closures such as 
({pred}? rule)+. I want to use the predicate to disambiguate 
constructs like:

  (identifier|keyword)+ rule_with_keywords?

Previously, I solved this by manually assembling the FIRST set of 
rule_with_keywords and putting it into a negated syntactic predicate:

  ((~(KW123|KW456))=>(identifier|keyword))+ rule_with_keywords?

Then I asked myself: why not use {false}? as the predicate instead? It is 
only evaluated when the actual input is ambiguous, so at runtime the parser 
would prefer to exit the subrule, which is exactly what I want. 
However, it seems that the predicate also acts as a check _before_ the 
first iteration. So, for an input like "KW123 KW123", the generated parser 
fails with an "early exit exception" when I use {false}?.
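
To make the behaviour concrete, here is a minimal Java sketch of the loop shape I believe is responsible. It is a simplified model under my own assumptions, not ANTLR's actual generated code: the class PositiveClosureSketch, the exception EarlyExit and the method positiveClosure are hypothetical names, and every token is treated as ambiguous, so the predicate is consulted on each iteration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical, simplified model of ( {pred}? atom )+ ; NOT ANTLR's real output.
class PositiveClosureSketch {

    // Stand-in for the parser's "early exit" error (hypothetical name).
    static class EarlyExit extends RuntimeException {
        EarlyExit(String message) { super(message); }
    }

    // Consumes tokens starting at 'start' for as long as the predicate allows
    // entering the loop body; returns the number of iterations performed.
    static int positiveClosure(List<String> tokens, int start, BooleanSupplier pred) {
        int count = 0;
        int pos = start;
        while (true) {
            // Assumption: the input is ambiguous here, so the predicate decides.
            boolean enter = pos < tokens.size() && pred.getAsBoolean();
            if (!enter) {
                if (count >= 1) {
                    break; // normal exit: at least one iteration was done
                }
                // The predicate rejected even the first, mandatory iteration.
                throw new EarlyExit("(...)+ matched nothing at index " + pos);
            }
            pos++;   // "match" one atom
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("KW123", "KW123");
        System.out.println("pred=true:  " + positiveClosure(input, 0, () -> true));
        try {
            positiveClosure(input, 0, () -> false);
        } catch (EarlyExit e) {
            System.out.println("pred=false: " + e.getMessage());
        }
    }
}
```

In this model, a predicate that is always false can never get past the check for the first iteration, which matches what I observe.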

Would it make sense to add an option to ANTLR, something like 
"do_not_check_for_early_exit"? This would greatly simplify my grammar, 
as rules like these occur very often. Admittedly, this would also work:

  (identifier|keyword) ({false}? (identifier|keyword))* rule_with_keywords?

but of course it's more redundant.
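
As a rough illustration of why the unrolled form behaves differently, here is a small Java sketch. Again it is only a simplified model under my own assumptions, not ANTLR's generated code, and UnrolledClosureSketch and oneThenStar are hypothetical names. The first atom is matched unconditionally, and the predicate is only consulted for further iterations.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical, simplified model of  atom ( {pred}? atom )* ; NOT ANTLR's real output.
class UnrolledClosureSketch {

    // Matches one atom unconditionally, then keeps matching while the
    // predicate allows it; returns the number of atoms consumed.
    static int oneThenStar(List<String> tokens, int start, BooleanSupplier pred) {
        if (start >= tokens.size()) {
            throw new IllegalStateException("expected an atom at index " + start);
        }
        int count = 1;        // first atom: the predicate is not consulted
        int pos = start + 1;
        // Assumption: every following token is ambiguous, so the predicate decides.
        while (pos < tokens.size() && pred.getAsBoolean()) {
            pos++;
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("KW123", "KW123");
        // With an always-false predicate, only the first token is consumed and
        // the second one is left for whatever follows the subrule.
        System.out.println(oneThenStar(input, 0, () -> false));
    }
}
```

With {false}? the loop stops after the first atom without raising an error, which is the behaviour I would like the plain (...)+ form to have.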

Greetings,
Andreas Meyer


--------------

grammar Lang;

options
{
output=AST;
}

@members {
    public boolean match( String rulename ) {
        boolean didMatch = false;
        int mark = input.mark();
        // increment before the try block so the decrement below always balances it
        state.backtracking ++;
        try {
            java.lang.reflect.Method m = this.getClass().getMethod( rulename );
            m.invoke( this );
            // while backtracking, a mismatch is reported through state.failed
            // (as in ANTLR's generated synpred methods) rather than thrown
            didMatch = !state.failed;
        }
        catch( NoSuchMethodException e )
        {
            e.printStackTrace();
            didMatch = false;
        }
        catch( IllegalAccessException e )
        {
            e.printStackTrace();
            didMatch = false;
        }
        catch (java.lang.reflect.InvocationTargetException e) {
            // the rule threw (e.g. a RecognitionException): treat as no match
            didMatch = false;
        }
        state.backtracking --;
        state.failed = false;
        input.rewind (mark);
        return didMatch;
    }
}

start
    : (stmt '.')+ EOF
    ;


// this works, because at the first occurrence of atom, the "predicate"
// is not evaluated ...
//stmt    : KW_3 atom ( {false}? atom)* option?
//    ;

// however, here, the generated code tries to disambiguate, although
// there is no need to,
// because '+' tells that at least one atom is wanted
stmt
    : KW_3 ( {false}? atom)+ option?
    ;

//stmt    : KW_3 ( {!match("option")}? atom)+ option?
//    ;


option
    : KW_1^ KW_2
    | KW_1^
    | KW_2^
    ;    


atom
    : ID
    | keyword
    ;
   

keyword
    : KW_1
    | KW_2
    | KW_3
    | KW_4
    ;

KW_1 : 'kw1';
KW_2 : 'kw2';
KW_3 : 'kw3';
KW_4 : 'kw4';   

ID: ('a'..'z' | 'A'..'Z' )+ ;
WS : (' '|'\n'|'\r'|'\t') {$channel=HIDDEN;} ;

