[antlr-interest] problem about "the code for the static initializer is exceeding the 65535 bytes limit"
Jim Idle
jimi at temporal-wave.com
Wed Aug 15 20:25:19 PDT 2012
Thanks Kyle.
BTW guys, you might not want to publish your grammars to the world, but if you want to send them to me privately I will give you a few pointers for free. I have even been known to accept paid gigs, though that does not seem to have happened for a while in this economy ;$
Jim
On Aug 15, 2012, at 6:04 PM, Kyle Ferrio <kferrio at gmail.com> wrote:
> Hi Zhaohui,
>
> You already know that you've discovered a theme which evokes some passion
> in the ANTLR community.
>
> There is a *lot* of wisdom in Jim Idle's suggestions. Each one could be a
> whole lecture. If you take a class in compiler construction (or go back to
> your notes, if you already had the class) you will see this up close.
>
> My version is "preserve information; defer decisions as long as possible;
> and make every decision as simple as possible." If you do these things,
> your language will be easy to maintain and extend. And if you have users
> for any length of time, these characteristics are probably high on your
> list. Hopefully you were not given a pathological language spec.
>
> Good luck!
> On Aug 15, 2012 5:43 PM, "Zhaohui Yang" <yezonghui at gmail.com> wrote:
>
>> sounds promising :)
>>
>> We have written a program to separate those constants into several inner
>> classes, which solves it for now.
>>
>> Yours is definitely better:)
>> On 2012-8-16 at 1:13 AM, "Francis ANDRE" <francis.andre.kampbell at orange.fr> wrote:
>>
>>> On 15/08/2012 16:17, Zhaohui Yang wrote:
>>>
>>> It's great someone is already trying a fix. I'd be glad to test your fix
>>> when it's out.
>>>
>>> Would you please describe briefly what kind of fix it is? Is it for
>>> ANTLRWorks or the ANTLR tool? Is it a command-line option for separating
>>> the FOLLOW sets or suppressing them, or something else?
>>>
>>> The 64K syndrome is purely a Java problem: the JVM does not support a
>>> static initializer larger than 64K (shame on it). If you look at the
>>> generated lexer and parser, you will certainly see a lot of DFA classes,
>>> each of them having some static initializer values. The point is that the
>>> sum of the static initializers of all those DFAs is greater than 64K, while
>>> the static initialization of each DFA is fairly small, or in most cases
>>> less than 64K. Thus, one solution is to extract all those DFA classes and
>>> put them outside the lexer or the parser, in fixed directories following
>>> this pattern:
>>>
>>> Let <grammar> be the directory of the grammar to generate; then all the
>>> generated DFAs will go in
>>>
>>> for the lexer's DFAs: package <grammar>.lexer;
>>> for the parser's DFAs: package <grammar>.parser;
>>>
>>> and the references to all those DFAs will be
>>> in the lexer: import <grammar>.lexer.*;
>>> in the parser: import <grammar>.parser.*;
>>>
>>> But hold on, the fix has to be approved by Terr and I have not yet
>>> submitted it. It needs to pass all the unit tests of ANTLR 3.4, and I am
>>> working on it... There is a real challenge in getting the parser/lexer
>>> compiled for Java code generated without a package, and all those unit
>>> tests produce the Java parser/lexer in the top-level directory.
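
The idea behind moving the DFA data out of the parser class can be illustrated with a small, self-contained sketch. It rests on one fact about the JVM: the bytecode of any single method is limited to 65535 bytes, and each class's static initializer is compiled into one such method, so the limit applies per class rather than per source file. The sketch uses nested holder classes instead of the separate packages described above, and every name in it (SplitInitializerDemo, Dfa1Tables, Dfa2Tables, EOT, LOG) is hypothetical, not taken from ANTLR's generated code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a generated parser; all names are illustrative.
final class SplitInitializerDemo {
    // Records which static initializers have run, in order.
    static final List<String> LOG = new ArrayList<>();

    // Each nested class is compiled to its own class file with its own
    // static initializer, so the size limit applies to each one separately.
    static final class Dfa1Tables {
        static final short[] EOT = {-1, 2, 3};
        static { LOG.add("Dfa1Tables"); }
    }

    static final class Dfa2Tables {
        static final short[] EOT = {-1, -1, 7, 8};
        static { LOG.add("Dfa2Tables"); }
    }

    // The first access to a holder's field triggers that holder's initializer.
    static int firstTableLength() { return Dfa1Tables.EOT.length; }
    static int secondTableLength() { return Dfa2Tables.EOT.length; }

    public static void main(String[] args) {
        System.out.println(firstTableLength());
        System.out.println(LOG);   // only Dfa1Tables has been initialized so far
        System.out.println(secondTableLength());
        System.out.println(LOG);   // now both holders have been initialized
    }
}
```

Because each holder is initialized on first use, the tables no longer add up inside a single initializer; whether the same holds for the patch discussed in this thread depends on how it lays out the extracted classes.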
>>>
>>>
>>> 2012/8/15 Francis ANDRE <francis.andre.kampbell at orange.fr>
>>>
>>>> Hi Zhaohui
>>>>
>>>> I am currently working on fixing this issue with ANTLR 3.4... Once I
>>>> have a proper patch, would you be interested in testing it?
>>>>
>>>> FA
>>>> On 14/08/2012 18:05, Zhaohui Yang wrote:
>>>>
>>>> Hi,
>>>>>
>>>>> Here we have a big grammar, and the generated parser.java fails to
>>>>> compile with the error: "the code for the static initializer is
>>>>> exceeding the 65535 bytes limit".
>>>>>
>>>>> I've searched the net for a while and found that this is a widely known
>>>>> limit in the JVM or the javac compiler, and that there is not yet an
>>>>> option to raise it.
>>>>>
>>>>> On the ANTLR side, I found 2 solutions proposed by others, but neither
>>>>> of them is totally satisfying:
>>>>>
>>>>> 1. Separate the big grammar into 2 *.g files and import one from the
>>>>> other.
>>>>> Yes, this removes the compilation error with the generated Java. But
>>>>> ANTLRWorks does not support imported grammars well. E.g., I cannot
>>>>> interpret a rule in the imported grammar; it's simply not in the rule
>>>>> list for interpreting. And gunit always fails with rules defined in the
>>>>> imported grammar.
>>>>>
>>>>> 2. Modify the generated Java source: separate the "FOLLOW_xxx_in_yyy"
>>>>> constants into several static classes and change the references to them
>>>>> accordingly.
>>>>> This is proposed here -
>>>>> http://www.antlr.org/pipermail/antlr-interest/2009-November/036608.html.
>>>>> The author of the post actually has a solution built into the ANTLR
>>>>> source code (some string template). But I can't find the attachment he
>>>>> referred to. And that was in 2009, so I suspect the fix could be
>>>>> incompatible with the current ANTLR version.
>>>>> Without this fix we have to do the modification manually or write a
>>>>> script for that. The script is not that easy.
>>>>>
>>>>> And we found a 3rd solution by ourselves, which also involves changing
>>>>> the generated Java:
>>>>>
>>>>> 3. Remove those FOLLOW_... constants completely, and replace the
>>>>> references with "null".
>>>>> Surprisingly this works; there is just no error recovery after this,
>>>>> which is not a problem for us. But we really worry that this is unsafe,
>>>>> since it's not documented anywhere.
>>>>>
>>>>> After all, we're looking for any other solution that is easier to
>>>>> apply, assuming we'll be constantly changing the grammar and recompiling
>>>>> the parser.
>>>>>
>>>>> Maybe there is a way to get ANTLRWorks and gunit to play well with
>>>>> imported grammars?
>>>>> Maybe there is already a command-line option for the ANTLR Tool that can
>>>>> generate the FOLLOW_... constants in separate classes?
>>>>> Maybe there is already a command-line option for the ANTLR Tool that can
>>>>> suppress FOLLOW_... constant code generation?
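
For option 2 in the original post, a post-processing step along the following lines might be enough. This is only a sketch under stated assumptions, not the 2009 patch and not anything the ANTLR tool does itself: it assumes each constant is declared on a single line of the form "public static final BitSet FOLLOW_... = ...;", and the names FollowSplitter, FollowSets<n> and perHolder are invented for illustration. It moves the declarations into nested holder classes and qualifies every reference with the holder's name.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; not part of ANTLR.
final class FollowSplitter {
    // Assumed shape of one generated declaration, all on a single line.
    private static final Pattern DECL = Pattern.compile(
        "(?m)^[ \\t]*public static final BitSet (FOLLOW_\\w+)\\s*=.*;[ \\t]*\\r?\\n?");
    private static final Pattern REF = Pattern.compile("\\bFOLLOW_\\w+");

    static String split(String source, int perHolder) {
        Map<String, String> holderOf = new LinkedHashMap<>();
        StringBuilder holders = new StringBuilder();
        Matcher decl = DECL.matcher(source);
        int count = 0;
        while (decl.find()) {
            int index = count / perHolder;
            if (count % perHolder == 0) {          // start a new holder class
                if (count > 0) holders.append("    }\n");
                holders.append("    static final class FollowSets")
                       .append(index).append(" {\n");
            }
            holders.append("        ").append(decl.group().trim()).append('\n');
            holderOf.put(decl.group(1), "FollowSets" + index);
            count++;
        }
        if (count == 0) return source;             // nothing to move
        holders.append("    }\n");

        // Drop the original declarations, then qualify the remaining references.
        String body = DECL.matcher(source).replaceAll("");
        Matcher ref = REF.matcher(body);
        StringBuffer out = new StringBuffer();
        while (ref.find()) {
            String holder = holderOf.get(ref.group());
            String text = holder == null ? ref.group() : holder + "." + ref.group();
            ref.appendReplacement(out, Matcher.quoteReplacement(text));
        }
        ref.appendTail(out);

        // Put the holders just before the closing brace of the outer class.
        int end = out.lastIndexOf("}");
        if (end < 0) out.append(holders); else out.insert(end, holders.toString());
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String source = new String(Files.readAllBytes(Paths.get(args[0])),
                                   StandardCharsets.UTF_8);
        // 500 constants per holder is an arbitrary guess, not a measured value.
        System.out.print(split(source, 500));
    }
}
```

Run it on the generated parser after every ANTLR build and compile its output instead of the original file. Unlike option 3, this keeps the FOLLOW sets, so error recovery is left as generated.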
>>>
>>>
>>> --
>>> Regards,
>>>
>>> Yang, Zhaohui
>>
>> List: http://www.antlr.org/mailman/listinfo/antlr-interest
>> Unsubscribe:
>> http://www.antlr.org/mailman/options/antlr-interest/your-email-address
>