[antlr-interest] Early exit exception in positive closures
Andreas Meyer
andreas.meyer at smartshift.de
Tue Mar 10 04:46:23 PDT 2009
Hi!
I have a question regarding predicates and positive closures like
({pred}? rule)+. I want to use the predicate in order to disambiguate
things like:
(identifier|keyword)+ rule_with_keywords?
Previously, I solved this by manually assembling a FIRST-set of
rule_with_keywords and putting this into a negated syntactic predicate:
((~(KW123|KW456))=>(identifier|keyword))+ rule_with_keywords?
Now I asked myself: why not use {false}? as the predicate instead? A
predicate is only evaluated when the actual input is ambiguous, so at
runtime the parser would prefer to exit the subrule, which is exactly
what I want. However, it seems that the predicate also acts as a check
_before_ even the first iteration. So, for an input like
"KW123 KW123", the generated parser complains about an "early exit
exception" when I use {false}?.
Would it make sense to add an option to ANTLR like
"do_not_check_for_early_exit"? This would greatly simplify my grammar,
as rules like these occur very often. Sure, this would also work:
(identifier|keyword) ({false}? (identifier|keyword))* rule_with_keywords?
but of course it's more redundant, since (identifier|keyword) has to be
written twice.
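To make the difference between the two loop shapes concrete, here is a
small hand-written Java sketch of the decision logic described above. It
is not the code ANTLR generates: the class EarlyExitSketch, its methods,
the string-based tokens and the one-token lookahead are all assumptions
made for illustration, and the real LL(*) decision may look further
ahead. It only models the atoms that follow KW_3, and it consults the
predicate solely when the next token could also start rule "option".

```java
// Hand-written sketch, not ANTLR output. All names here are hypothetical.
final class EarlyExitSketch {

    /** Stand-in for ANTLR's early exit error: the (...)+ loop matched nothing. */
    static final class EarlyExit extends RuntimeException {
        EarlyExit(int pos) {
            super("early exit at token index " + pos);
        }
    }

    /** True if the token could begin rule 'option' (KW_1 or KW_2). */
    static boolean startsOption(String tok) {
        return tok.equals("kw1") || tok.equals("kw2");
    }

    /** True if the token could begin rule 'atom' (an ID or any keyword). */
    static boolean startsAtom(String tok) {
        return !tok.equals(".");
    }

    /** Models ( {pred}? atom )+ and returns the index after the last atom. */
    static int plusClosure(String[] toks, int pos, boolean pred) {
        int cnt = 0;
        while (true) {
            boolean enter = pos < toks.length && startsAtom(toks[pos]);
            if (enter && startsOption(toks[pos])) {
                enter = pred; // ambiguous token: the predicate decides
            }
            if (!enter) {
                if (cnt >= 1) {
                    break;
                }
                throw new EarlyExit(pos); // no iteration completed yet
            }
            pos++;
            cnt++;
        }
        return pos;
    }

    /** Models atom ( {pred}? atom )* : the first atom is never gated. */
    static int unrolled(String[] toks, int pos, boolean pred) {
        if (pos >= toks.length || !startsAtom(toks[pos])) {
            throw new IllegalStateException("expected an atom at index " + pos);
        }
        pos++;
        while (pos < toks.length && startsAtom(toks[pos])) {
            if (startsOption(toks[pos]) && !pred) {
                break; // ambiguous token and the predicate says: exit
            }
            pos++;
        }
        return pos;
    }

    public static void main(String[] args) {
        String[] toks = {"kw1", "kw1", "."}; // tokens following KW_3
        System.out.println("unrolled stops at index " + unrolled(toks, 0, false));
        try {
            plusClosure(toks, 0, false);
        } catch (EarlyExit e) {
            System.out.println("plus closure: " + e.getMessage());
        }
    }
}
```

With the tokens kw1 kw1 . and the predicate fixed to false, plusClosure
throws before consuming anything, while unrolled consumes the first kw1
and leaves the second one for "option".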
Greetings,
Andreas Meyer
--------------
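grammar Lang;

options
{
    output=AST;
}

@members {
    // Speculatively runs the named rule and reports whether it matched.
    // The input position is restored afterwards.
    public boolean match( String rulename ) {
        boolean didMatch = false;
        int mark = input.mark();
        // Increment outside the try block so that the decrement below
        // always has a matching increment, even if the lookup fails.
        state.backtracking++;
        try {
            java.lang.reflect.Method m = this.getClass().getMethod( rulename );
            m.invoke( this );
            // While backtracking, generated rules set state.failed
            // instead of throwing a RecognitionException.
            didMatch = !state.failed;
        }
        catch( NoSuchMethodException e ) {
            e.printStackTrace();
        }
        catch( IllegalAccessException e ) {
            e.printStackTrace();
        }
        catch( java.lang.reflect.InvocationTargetException e ) {
            // A RecognitionException cause means the rule did not match;
            // didMatch stays false for any other cause as well.
        }
        finally {
            state.backtracking--;
            state.failed = false;
            input.rewind( mark );
        }
        return didMatch;
    }
}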

start
    : (stmt '.')+ EOF
    ;

// this works, because at the first occurrence of atom, the "predicate"
// is not evaluated ...
//stmt : KW_3 atom ( {false}? atom)* option?
//     ;

// however, here, the generated code tries to disambiguate, although
// there is no need to,
// because '+' tells that at least one atom is wanted
stmt
    : KW_3 ( {false}? atom)+ option?
    ;

//stmt : KW_3 ( {!match("option")}? atom)+ option?
//     ;

option
    : KW_1^ KW_2
    | KW_1^
    | KW_2^
    ;

atom
    : ID
    | keyword
    ;

keyword
    : KW_1
    | KW_2
    | KW_3
    | KW_4
    ;

KW_1 : 'kw1';
KW_2 : 'kw2';
KW_3 : 'kw3';
KW_4 : 'kw4';

ID : ('a'..'z' | 'A'..'Z' )+ ;
WS : (' '|'\n'|'\r'|'\t') {$channel=HIDDEN;} ;