[antlr-interest] Generated parser class too large to compile
Ron Hunter-Duvar
ron.hunter-duvar at oracle.com
Thu Feb 25 14:35:29 PST 2010
Hi,
I'm running into a problem with the Java parser class generated by Antlr
3.2 being too large to compile. I don't think there's anything wrong
with my grammar or with Antlr; it's simply the size and complexity of
the grammar. The grammar is already 2,500 lines and 208 rules, and
Antlr generates 68,000 lines of output from it. This is just the parser
grammar (the
lexer grammar is separate and isn't a problem), and I'm not done yet.
The problem is that Java is not an ideal language target for code
generation, given its limit of 64KB of bytecode per method, static
initializers included (and various other 64K limits, such as the size
of a class's constant pool), due to the class file format using 16-bit
indexes and counts
(http://java.sun.com/docs/books/jvms/second_edition/html/ClassFile.doc.html#88659).
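To see how close a generated class is to one of these limits, the class file header can be read directly. The following is a minimal sketch, not part of Antlr: the class name ClassFileLimits and its methods are made up for illustration. It relies only on the documented class file layout (a 4-byte magic number, two 2-byte version fields, then the 2-byte constant_pool_count).

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, for illustration only.
final class ClassFileLimits {
    /** Largest value an unsigned 16-bit (u2) class file field can hold. */
    static final int U2_MAX = 0xFFFF;

    /** Reads constant_pool_count from the start of a class file stream. */
    static int constantPoolCount(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        if (data.readInt() != 0xCAFEBABE) {      // magic number
            throw new IOException("not a class file");
        }
        data.readUnsignedShort();                // minor_version
        data.readUnsignedShort();                // major_version
        return data.readUnsignedShort();         // constant_pool_count (u2)
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ClassFileLimits <file.class>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            int count = constantPoolCount(in);
            System.out.println("constant_pool_count = " + count
                    + " (u2 field, maximum " + U2_MAX + ")");
        }
    }
}
```

Running it against a compiled parser class shows how much of the constant pool is already used. It says nothing about the size of individual methods, which would need a full class file parser or a tool such as javap.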
I've been able to work around the problem with a poor man's refactoring,
a Perl script that breaks out the one generated class into interfaces
for the constants (tokens, DFA initializations) and an abstract
superclass for the DFA nested classes and methods and stubs for all the
other methods. This is working, but as I continue I have to keep
refining it to do more refactoring. It's really a kludge, and only works
by relying on the specific structure and formatting of the Antlr output.
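The shape of the factored output described above might look roughly like the sketch below. Every name in it (DemoTokens, DemoDfaTables, RunLength, DemoParserBase, DemoParser) is invented for illustration, the table contents are toy values, and the run-length decoder is only a stand-in for whatever the generated code really uses to unpack its DFA strings. It is not actual Antlr output.

```java
// All names are hypothetical; this only illustrates the layout.

/** Token type constants, moved out of the parser class. */
interface DemoTokens {
    int EOF = -1;
    int ID = 4;
    int INT = 5;
}

/** DFA tables; the array is built in this interface's own initializer. */
interface DemoDfaTables {
    short[] DFA1_TRANSITION = RunLength.unpack("\2\1\3\7");
}

/** Stand-in decoder: the string holds (count, value) character pairs. */
final class RunLength {
    static short[] unpack(String encoded) {
        int size = 0;
        for (int i = 0; i < encoded.length(); i += 2) {
            size += encoded.charAt(i);
        }
        short[] out = new short[size];
        int pos = 0;
        for (int i = 0; i < encoded.length(); i += 2) {
            char count = encoded.charAt(i);
            short value = (short) encoded.charAt(i + 1);
            for (int j = 0; j < count; j++) {
                out[pos++] = value;
            }
        }
        return out;
    }
}

/** Abstract superclass: DFA-related code plus stubs for the rule methods. */
abstract class DemoParserBase implements DemoTokens, DemoDfaTables {
    /** Toy lookup standing in for the DFA nested classes. */
    protected int predict(int state) {
        return DFA1_TRANSITION[state];
    }

    /** Stub; the real rule body lives in the concrete parser. */
    public abstract String expr(int tokenType);
}

/** Concrete parser: only the rule methods remain here. */
class DemoParser extends DemoParserBase {
    @Override
    public String expr(int tokenType) {
        if (tokenType == ID) return "identifier";
        if (tokenType == INT) return "integer";
        return "other";
    }

    public static void main(String[] args) {
        DemoParser parser = new DemoParser();
        System.out.println(parser.expr(ID));
        System.out.println(parser.predict(2));
    }
}
```

The point of the split is that each interface and class compiles to its own class file, with its own constant pool and its own static initializer, so the table-building code no longer counts against the parser class.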
I'm thinking that a more general solution would be to modify the code
generation to generate factored code. I've only looked briefly at it so
far, but since it's all driven by StringTemplate templates and already
accommodates multiple output languages, it shouldn't be too difficult to
adapt it. I would probably create a new back-end "language" such as
"FactoredJava", based on the Java templates. That would make switching
between the standard one and mine a simple grammar option change. Does
anyone see a problem with this plan? Any suggestions?
The only other alternative I see is to switch to a back-end language
that doesn't have this limitation. But that creates quite a bit of
rework (replacing semantic predicates and action code, and the
subclasses of standard Antlr runtime classes that I've created to
customize the behaviour), as well as integration issues with all the
other Java code.
Is there anything I'm missing here? Any Antlr options that would
significantly reduce the size of the generated code?
Thanks,
Ron
--
Ron Hunter-Duvar | Software Developer V | 403-272-6580
Oracle Service Engineering
Gulf Canada Square 401 - 9th Avenue S.W., Calgary, AB, Canada T2P 3C5
All opinions expressed here are mine, and do not necessarily represent
those of my employer.