[antlr-interest] Generated parser class too large to compile

Ron Hunter-Duvar ron.hunter-duvar at oracle.com
Thu Feb 25 16:35:19 PST 2010


Wow, thanks! I was under the mistaken impression that imported grammars 
were essentially included (maybe a past version of Antlr did this?), so 
I didn't think this would help.
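
A rough Java sketch of the difference between "included" and "imported", as I understand it from Jim's description below. It is not actual ANTLR output: the class names (RootParser, SelectDelegate, DropDelegate), the method names, and the toy token handling are all hypothetical. The only point is the shape, where each imported grammar becomes its own class and the top-level parser forwards rule calls to it, so no single class has to hold every rule.

```java
import java.util.List;

// Hypothetical stand-ins for generated classes; real ANTLR output differs in detail.

/** Plays the role of the class generated for one imported grammar (e.g. tsqlselect). */
class SelectDelegate {
    // Toy "rule method": wraps everything after the SELECT keyword.
    String selectStatement(List<String> tokens) {
        return "select(" + String.join(" ", tokens.subList(1, tokens.size())) + ")";
    }
}

/** Plays the role of the class generated for another imported grammar (e.g. tsqldrop). */
class DropDelegate {
    // Toy "rule method": wraps everything after the DROP keyword.
    String dropStatement(List<String> tokens) {
        return "drop(" + String.join(" ", tokens.subList(1, tokens.size())) + ")";
    }
}

/** Plays the role of the class generated for the top-level grammar. */
class RootParser {
    // One field per imported grammar; the rule bodies live in those classes.
    private final SelectDelegate selectDelegate = new SelectDelegate();
    private final DropDelegate dropDelegate = new DropDelegate();

    // A base rule kept in the top-level grammar: it only decides which
    // delegate to call, so this class stays small.
    String statement(List<String> tokens) {
        if (tokens.isEmpty()) {
            throw new IllegalArgumentException("no tokens");
        }
        switch (tokens.get(0).toUpperCase()) {
            case "SELECT":
                return selectDelegate.selectStatement(tokens);
            case "DROP":
                return dropDelegate.dropStatement(tokens);
            default:
                throw new IllegalArgumentException("unexpected token: " + tokens.get(0));
        }
    }
}
```

If the imported grammars were textually included instead, all of those rule methods would land in one class, which is exactly the situation that hits the size limit.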

Ron


Jim Idle wrote:
> Ron,
>
> All you need to do is create a top-level grammar file that imports the other parts of your SQL grammar; each imported grammar is then generated as a separate class. From my TSQL grammar:
>
> parser grammar tsql;
>
> options
> {
> 	// Produce a generic AST as output.
> 	//
> 	output		= AST;
>
> 	// Import the lexers token numbering scheme.
> 	//
> 	tokenVocab	= tsqllexer;
> }
>
>  
> // Import the grammar for the million SQL statements
> //
> import tsqlcommon, tsqlselect, tsqlalter, tsqlcreate, tsqlpermissions, tsqlcursors, tsqlmisc, tsqlmisc2, tsqldrop;
>
> tokens { X; Y; }
>
> @parser::header{  }
>
> a : couple of base rules ;
>
> couple : ;
>
> // Rest are in the other grammars. This works for the tree walkers too.
>
>
> Jim
>
>   
>> -----Original Message-----
>> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Ron Hunter-Duvar
>> Sent: Thursday, February 25, 2010 2:35 PM
>> To: antlr-interest at antlr.org
>> Subject: [antlr-interest] Generated parser class too large to compile
>>
>> Hi,
>>
>> I'm running into a problem with the Java parser class generated by
>> Antlr 3.2 being too large to compile. I don't think there's anything
>> wrong with my grammar or with Antlr; it's simply the size and
>> complexity of the grammar. It's already 2,500 lines of code and 208
>> rules, and Antlr generates 68,000 lines of output. This is just the
>> parser grammar (the lexer grammar is separate and isn't a problem),
>> and I'm not done yet. The problem is that Java is not an ideal target
>> language for code generation, given its limit of 64KB of bytecode per
>> method (and various other 64K limits), due to the class file format
>> using 16-bit indexes
>> (http://java.sun.com/docs/books/jvms/second_edition/html/ClassFile.doc.html#88659).
>>
>>
>> I've been able to work around the problem with a poor man's
>> refactoring, a Perl script that breaks out the one generated class
>> into interfaces for the constants (tokens, DFA initializations) and an
>> abstract superclass for the DFA nested classes and methods and stubs
>> for all the other methods. This is working, but as I continue I have
>> to keep refining it to do more refactoring. It's really a kludge, and
>> only works by relying on the specific structure and formatting of the
>> Antlr output.
>>
>> I'm thinking that a more general solution would be to modify the code
>> generation to generate factored code. I've only looked briefly at it so
>> far, but since it's all driven by StringTemplate templates and already
>> accommodates multiple output languages, it shouldn't be too difficult to
>> adapt it. I would probably create a new back-end "language" such as
>> "FactoredJava", based on the Java templates. That would make switching
>> between the standard one and mine a simple grammar option change. Does
>> anyone see a problem with this plan? Any suggestions?
>>
>> The only other alternative I see is to switch to a back-end language
>> that doesn't have this limitation. But that creates quite a bit of
>> rework (replacing semantic predicates and action code, and the
>> subclasses of standard Antlr runtime classes that I've created to
>> customize the behaviour), as well as integration issues with all the
>> other Java code.
>>
>> Is there anything I'm missing here? Any Antlr options that would
>> significantly reduce the size of the generated code?
>>
>> Thanks,
>> Ron
>>
>> --
>> Ron Hunter-Duvar | Software Developer V | 403-272-6580
>> Oracle Service Engineering
>> Gulf Canada Square 401 - 9th Avenue S.W., Calgary, AB, Canada T2P 3C5
>>
>> All opinions expressed here are mine, and do not necessarily represent
>> those of my employer.
>>
>>
>> List: http://www.antlr.org/mailman/listinfo/antlr-interest
>> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address



