[antlr-interest] ANTLR Semantic Predicate Check Exceeding 65535 Bytes Limit
Zachary Palmer
zep_antlr at bahj.com
Wed Oct 27 15:46:04 PDT 2010
Hello again. :)
I seem to have hit an interesting boundary: ANTLR has generated a method
whose code exceeds Java's 64K-per-method limit. This would be relatively unremarkable -
goodness knows that this has happened to other people - except that the
particular reason I have encountered seems to be somewhat strange. The
following is a method generated in a DFA that ANTLR produced for me:
public int specialStateTransition(int s, IntStream _input)
        throws NoViableAltException {
    TokenStream input = (TokenStream)_input;
    int _s = s;
    switch ( s ) {
        case 0 :
            int LA58_0 = input.LA(1);
            int index58_0 = input.index();
            input.rewind();
            s = -1;
            if ( (LA58_0==128) &&
((((configuration.getMetaprogramsSupported())&&(configuration.getCodeSplicingSupported()))||((configuration.getMetaAnnotationsSupported())&&(configuration.getCodeSplicingSupported()))||((configuration.getMetaAnnotationsSupported())&&(configuration.getCodeSplicingSupported()))||
/****** GREAT BIG SNIP ******/
((configuration.getMetaAnnotationsSupported())&&(configuration.getCodeSplicingSupported()))||((configuration.getMetaprogramsSupported())&&(configuration.getCodeSplicingSupported())))))
            {s = 1;}
            else if (
                (LA58_0==ABSTRACT||LA58_0==CLASS||(LA58_0>=ENUM &&
                LA58_0<=FINAL)||LA58_0==INTERFACE||LA58_0==NATIVE||(LA58_0>=PRIVATE &&
                LA58_0<=PUBLIC)||(LA58_0>=STATIC &&
                LA58_0<=STRICTFP)||LA58_0==SYNCHRONIZED||LA58_0==TRANSIENT||LA58_0==VOLATILE||LA58_0==SEMI||LA58_0==MONKEYS_AT)
            ) {s = 2;}
            else if ( (LA58_0==METAPROGRAM_START) &&
                ((configuration.getMetaprogramsSupported()))) {s = 18;}
            input.seek(index58_0);
            if ( s>=0 ) return s;
            break;
        case 1 :
            int LA58_1 = input.LA(1);
            int index58_1 = input.index();
            input.rewind();
            s = -1;
            if (
                ((synpred64_BsjAntlr()&&(configuration.getCodeSplicingSupported()))) )
            {s = 19;}
            else if (
((((configuration.getMetaAnnotationsSupported())&&(configuration.getCodeSplicingSupported()))||(configuration.getCodeSplicingSupported())||((configuration.getMetaprogramsSupported())&&(configuration.getCodeSplicingSupported()))||((configuration.getMetaAnnotationsSupported())&&(configuration.getCodeSplicingSupported()))||((configuration.getMetaAnnotationsSupported())&&(configuration.getCodeSplicingSupported()))||((configuration.getMetaAnnotationsSupported())&&(configuration.getCodeSplicingSupported()))||((configuration.getMetaAnnotationsSupported())&&(configuration.getCodeSplicingSupported()))))
            ) {s = 18;}
            input.seek(index58_1);
            if ( s>=0 ) return s;
            break;
    }
    if (state.backtracking>0) {state.failed=true; return -1;}
    NoViableAltException nvae =
        new NoViableAltException(getDescription(), 58, _s, input);
    error(nvae);
    throw nvae;
}
Looks pretty normal... except for the astonishingly common constructs
like
"((configuration.getMetaprogramsSupported())&&(configuration.getCodeSplicingSupported()))".
In fact, each of these constructs is a semantic predicate applied to a
rule I use fairly often: meta-annotations. These are similar to normal
annotations in that they can appear in the modifiers clause of any Java
declaration (or, additionally, as a prefix to any Java statement). I
guarded them with this semantic predicate to allow me to turn my parser
into a normal Java parser by fiddling with the configuration; I observed
this technique used in the Java 1.5 parser on the ANTLRv3 site and
thought it quite sensible.
The catch is that the comment /****** GREAT BIG SNIP ******/ above is
hiding more than 200,000 characters of code. This same condition is
repeated here an enormous number of times. I've looked throughout my
parser and discovered that this same pattern appears in many other
places as well. In some cases, a generated ANTLR syntactic predicate is
also called in this fashion, such as in:
if (
    (((synpred188_BsjAntlr()&&(configuration.getCodeSplicingSupported()))||(synpred188_BsjAntlr()&&(configuration.getCodeSplicingSupported()))||(synpred188_BsjAntlr()&&(configuration.getCodeSplicingSupported()))))
) {
    alt150=1;
}
Does anyone have any idea what I've done to so infuriate the gods? I'll
probably be using a regular expression to seek through the code and
eliminate the most egregious of cases -- I'm already using an ANT script
to add @SuppressWarnings annotations to the classes, so it's not that
far out of my build process -- but any hint as to how I did this or what
I could do to solve it correctly would be quite appreciated.
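For concreteness, here is roughly the kind of cleanup I have in mind, as a minimal sketch rather than anything tested against the real generated file. It splits a condition at its top-level "||" operators and keeps only the first copy of each alternative. The class and method names (PredicateDeduper, splitTopLevelOr, dedupe) are made up for illustration, and it assumes the caller has already stripped the extra outer parentheses ANTLR wraps around the whole disjunction, and that the predicates have no side effects, so dropping repeats does not change the result.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of ANTLR: removes repeated alternatives
// from a generated "A||B||A||..." predicate expression.
public class PredicateDeduper {

    /** Splits expr at every "||" that sits at parenthesis depth zero. */
    static List<String> splitTopLevelOr(String expr) {
        List<String> parts = new ArrayList<>();
        int depth = 0;
        int start = 0;
        for (int i = 0; i < expr.length(); i++) {
            char c = expr.charAt(i);
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
            } else if (c == '|' && depth == 0
                    && i + 1 < expr.length() && expr.charAt(i + 1) == '|') {
                parts.add(expr.substring(start, i));
                start = i + 2; // skip past both '|' characters
                i++;
            }
        }
        parts.add(expr.substring(start));
        return parts;
    }

    /** Keeps the first occurrence of each alternative, preserving order. */
    static String dedupe(String expr) {
        Set<String> unique = new LinkedHashSet<>();
        for (String part : splitTopLevelOr(expr)) {
            unique.add(part.trim());
        }
        return String.join("||", unique);
    }

    public static void main(String[] args) {
        // Assumption: outer wrapping parentheses were removed beforehand.
        String condition =
            "(synpred188_BsjAntlr()&&(configuration.getCodeSplicingSupported()))"
            + "||(synpred188_BsjAntlr()&&(configuration.getCodeSplicingSupported()))"
            + "||(synpred188_BsjAntlr()&&(configuration.getCodeSplicingSupported()))";
        System.out.println(dedupe(condition));
    }
}
```

Keeping the first occurrence of each alternative preserves the short-circuit evaluation order, which matters mostly for the syntactic predicates since those are expensive to run. This only treats the symptom, of course; it does nothing about why the duplicates are generated in the first place.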
Thanks!
- Zach