[antlr-interest] Generated parser class too large to compile

Jim Idle jimi at temporal-wave.com
Thu Feb 25 14:43:55 PST 2010


Ron,

All you need to do is create a top-level grammar file that imports the other parts of your SQL grammar; each imported grammar is then generated as a separate class. From my TSQL grammar:

parser grammar tsql;

options
{
	// Produce a generic AST as output.
	//
	output		= AST;

	// Import the lexer's token numbering scheme.
	//
	tokenVocab	= tsqllexer;
}

 
// Import the grammar for the million SQL statements
//
import tsqlcommon, tsqlselect, tsqlalter, tsqlcreate, tsqlpermissions, tsqlcursors, tsqlmisc, tsqlmisc2, tsqldrop;

tokens { X; Y; }

@parser::header{  }

a : couple of base rules ;

couple : ;

// The rest of the rules are in the imported grammars. This works for the tree walkers too.
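
A minimal sketch of what one of the imported grammars could look like, for illustration only. It is not taken from the actual TSQL grammar: the rule names and the token names (SELECT, FROM, COMMA, ID) are hypothetical, and the tokens are assumed to be defined by the lexer named in tokenVocab above. Any options the imported grammar may need are left out of the sketch.

```antlr
// tsqlselect.g: hypothetical imported grammar, pulled in by the
// import statement of the top-level tsql grammar.
parser grammar tsqlselect;

// Illustrative rules only; the real grammar would hold the full
// SELECT syntax here. Token names are assumptions.
select_statement
    : SELECT select_list FROM table_name
    ;

select_list
    : column_name (COMMA column_name)*
    ;

column_name : ID ;
table_name  : ID ;
```

Because the rules of each imported grammar end up in their own generated class, the SELECT rules no longer count toward the size of the class generated for the top-level grammar.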


Jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Ron Hunter-Duvar
> Sent: Thursday, February 25, 2010 2:35 PM
> To: antlr-interest at antlr.org
> Subject: [antlr-interest] Generated parser class too large to compile
> 
> Hi,
> 
> I'm running into a problem with the Java parser class generated by
> Antlr
> 3.2 being too large to compile. I don't think there's anything wrong
> with my grammar or with Antlr, it's simply the size and complexity of
> the grammar. It's already 2,500 lines of code, 208 rules, and Antlr
> generates 68,000 lines of output. This is just the parser grammar (the
> lexer grammar is separate and isn't a problem), and I'm not done yet.
> The problem is that Java is not an ideal language target for code
> generation, given its 64KB of bytecode per class limit (and various
> other 64K limits), due to the JVM using 16-bit pointers
> (http://java.sun.com/docs/books/jvms/second_edition/html/ClassFile.doc.html#88659).
> 
> 
> I've been able to work around the problem with a poor man's
> refactoring,
> a Perl script that breaks out the one generated class into interfaces
> for the constants (tokens, DFA initializations) and an abstract
> superclass for the DFA nested classes and methods and stubs for all the
> other methods. This is working, but as I continue I have to keep
> refining it to do more refactoring. It's really a kludge, and only
> works
> by relying on the specific structure and formatting of the Antlr
> output.
> 
> I'm thinking that a more general solution would be to modify the code
> generation to generate factored code. I've only looked briefly at it so
> far, but since it's all driven by StringTemplate templates and already
> accommodates multiple output languages, it shouldn't be too difficult to
> adapt it. I would probably create a new back-end "language" such as
> "FactoredJava", based on the Java templates. That would make switching
> between the standard one and mine a simple grammar option change. Does
> anyone see a problem with this plan? Any suggestions?
> 
> The only other alternative I see is to switch to a back-end language
> that doesn't have this limitation. But that creates quite a bit of
> rework (replacing semantic predicates and action code, and the
> subclasses of standard Antlr runtime classes that I've created to
> customize the behaviour), as well as integration issues with all the
> other Java code.
> 
> Is there anything I'm missing here? Any Antlr options that would
> significantly reduce the size of the generated code?
> 
> Thanks,
> Ron
> 
> --
> Ron Hunter-Duvar | Software Developer V | 403-272-6580
> Oracle Service Engineering
> Gulf Canada Square 401 - 9th Avenue S.W., Calgary, AB, Canada T2P 3C5
> 
> All opinions expressed here are mine, and do not necessarily represent
> those of my employer.
> 
