[antlr-interest] ANTLRWorks 1.4.3: XYZParser.java:14: code too large (public static final String[] tokenNames = new String[] { ... } ; )

Sam Barnett-Cormack s.barnett-cormack at lancaster.ac.uk
Mon Sep 26 12:57:59 PDT 2011


The possibility of static nested classes had also occurred to me as a way
of breaking up the otherwise-huge static initialisation code being
produced. For things like the follow sets, it might be worth looking
at the architecture of the generated code to see whether there's any
benefit to defining an abstract class that these static nested classes
could inherit from.
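
To make the nested-class idea concrete, here is a minimal sketch. It is
not what ANTLR actually emits: every class and field name is
hypothetical, and java.util.BitSet stands in for the runtime's own
BitSet class so that the example compiles on its own. Each nested class
is compiled to a separate .class file with its own <clinit>, so the
initialisers end up spread over several methods rather than one. The
abstract base class here only shares a construction helper.

```java
import java.util.BitSet;

// Hypothetical stand-in for a generated parser; not real ANTLR output.
final class XYZParserSketch {

    // Shared base class: keeps the construction logic in one place.
    abstract static class FollowSetHolder {
        static BitSet bits(long... words) {
            return BitSet.valueOf(words);
        }
    }

    // Each holder gets its own <clinit>, so no single static
    // initialiser has to contain all of the follow sets.
    static final class Follow0 extends FollowSetHolder {
        static final BitSet FOLLOW_ruleA_in_ruleB12 = bits(0x0000000000000002L);
        static final BitSet FOLLOW_ruleC_in_ruleB14 = bits(0x0000000000000010L);
    }

    static final class Follow1 extends FollowSetHolder {
        static final BitSet FOLLOW_ruleD_in_ruleE20 = bits(0x0000000000000120L);
    }

    // The token names can move into a holder of their own as well.
    static final class Tokens {
        static final String[] tokenNames = {
            "<invalid>", "<EOR>", "<DOWN>", "<UP>", "ID", "INT"
        };
    }

    public static void main(String[] args) {
        // A holder is initialised lazily, on first access to one of its fields.
        System.out.println(Follow0.FOLLOW_ruleA_in_ruleB12);
        System.out.println(Follow1.FOLLOW_ruleD_in_ruleE20);
        System.out.println(Tokens.tokenNames.length);
    }
}
```

With some 1300 follow sets, a generator would presumably start a new
holder class every few hundred fields; the right chunk size is a guess
and would need measuring against the 64k limit.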

Another possibility would be removing these large static data structures 
from the source entirely, and instead emitting some extra files that 
define them and that are read in by the generated code. Java resource 
loading semantics mean that files located with the class files are easy 
to load at run-time. A readable format could even be used to make it 
easier for humans to understand the code (and supporting data) that are 
generated.
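
As a rough sketch of the resource-file idea, assuming a hypothetical
format of one token name per line: the class name, method names and
file layout below are all invented for illustration and are not part
of ANTLR. The only real API relied on is Class.getResourceAsStream,
which looks a file up relative to the class on the classpath.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical loader for a token-name table kept outside the .java source.
final class TokenTableSketch {

    // Reads one token name per line, skipping empty lines.
    static String[] readTokenNames(InputStream in) {
        List<String> names = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.isEmpty()) {
                    names.add(line);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names.toArray(new String[0]);
    }

    // Looks the file up next to the class files, as described above.
    static String[] loadFromClasspath(String resourceName) {
        InputStream in = TokenTableSketch.class.getResourceAsStream(resourceName);
        if (in == null) {
            throw new IllegalStateException("resource not found: " + resourceName);
        }
        return readTokenNames(in);
    }

    public static void main(String[] args) {
        // An in-memory stream stands in for the resource file so the demo runs alone.
        byte[] demo = "<invalid>\n<EOR>\nID\nINT\n".getBytes(StandardCharsets.UTF_8);
        String[] names = readTokenNames(new ByteArrayInputStream(demo));
        System.out.println(names.length + " token names, last = " + names[names.length - 1]);
    }
}
```

In generated code the static field would then be initialised with a
single call such as loadFromClasspath("XYZParser.tokens") (a made-up
file name), which costs only a few bytes of <clinit> however long the
table is.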

Sam

On 26/09/2011 20:36, Vlad wrote:
> The Java .class file format has a specification limit: the bytecode of any single method must be smaller than 64 KB. If this limit is violated (which normally only happens with auto-generated code), you'll see the compiler error you got.
>
> Although it is not very common knowledge, any Java code of the form
>
> ... static final SomeType SOME_FIELD = <some exp>;
>
> is equivalent to the following (unless <some exp> is a compile-time constant, which needs no initialisation code):
>
> a declaration of
>
> ... static final SomeType SOME_FIELD;
>
> combined with bytecode that computes <some exp> and assigns it to SOME_FIELD, performed inside a special method named '<clinit>' at class initialization time. It is the same method that also collects anything you put in a static { } block. I had a look at the .java files generated from your grammar and there is absolutely a ton of public static finals that require such <clinit> code: some 1300 FOLLOW_... BitSets, tokenNames, ruleNames, etc.
>
> All of those static field init bytecodes end up in <clinit> and cause the size overflow. It seems to me that hitting the 64K limit can happen for any reasonably large grammar ("large" defined not just in terms of token count, but also the number of rules, etc).
>
> To address this issue at the fundamental level, ANTLR needs to alter its .java code emission strategy. Perhaps map rules to static methods of static nested classes instead of lumping everything into a single .class definition. Nested classes in Java are compiled into separate .class definitions, each with its own <clinit>.
>
> HTH,
> Vlad
>
> On Sep 26, 2011, at 1:06 PM, Udo Weik wrote:
>
>> Hello Terence,
>>
>>> Interesting. That's not that big. Only 162 strings should not be nearly enough to blow out the 64k static init method limit. Hmm... perhaps the other arrays are big as well.
>>> ter
>>
>> I just tried to delete 'static' but of course that doesn't work:
>> XYZParser.java:347: non-static variable tokenNames cannot be referenced from a static context
>> So the question is - any solution for that problem?
>>
>> Thanks and greetings
>> Udo
>>
>>
>>> On Sep 26, 2011, at 8:42 AM, Udo Weik wrote:
>>>
>>>> Hello Terence,
>>>>
>>>>> wow! how big is that grammar?
>>>>> Ter
>>>>
>>>> I'm trying to get the VHDL-grammar for the CSharp-target from
>>>> Mike Lodder working with Java:
>>>> http://www.antlr.org/grammar/1202750770887/vhdl.g
>>>>
>>>> For some first, very basic modifications, see the attachment.
>>>> First of all that grammar should work with ANTLRWorks 1.4.3.
>>>>
>>>> Many thanks for any support
>>>> Udo
>>>>
>>>>
>>>>> On Sep 26, 2011, at 6:50 AM, Udo Weik wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> the length of that line is 1647 chars (162 strings).
>>>>>> The grammar is an existing one. What can/must I do?
>>>>>>
>>>>>> [15:41:27] XYZParser.java:14: code too large
>>>>>> [15:41:27]     public static final String[] tokenNames = new String[] {
>>>>>> [15:41:27]                                  ^
>>>>>> [15:41:27] 1 error
>>>>>>
>>>>>>
>>>>>> Many thanks and greetings
>>>>>> Udo
>>>>>>
>>>>>>
>>>>>> List: http://www.antlr.org/mailman/listinfo/antlr-interest
>>>>>> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
>>>>>
>>>>
>>>> <vhdl__UW1a.g>
>>>
>>
>


