[antlr-interest] problem about "the code for the static initializer is exceeding the 65535 bytes limit"

Wed Aug 15 17:43:16 PDT 2012

sounds promising :)

We have written a program that separates those constants into several inner
classes, which solves the problem for now.
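
A minimal sketch of that idea, with made-up names: each nested class gets its
own static initializer, so no single initializer has to hold every constant.
Here java.util.BitSet is only a stand-in for the ANTLR runtime's BitSet, and
SplitConstantsSketch, FollowSets1, FollowSets2 and the FOLLOW_... names are
hypothetical, not taken from a real generated parser.

```java
import java.util.BitSet;

// Hypothetical stand-in for a generated parser; all names are illustrative.
class SplitConstantsSketch {

    // First holder class: its constants are initialized in this class's
    // own static initializer, separate from the outer class's.
    static final class FollowSets1 {
        static final BitSet FOLLOW_expr_in_stat10 =
            BitSet.valueOf(new long[]{0x0000000000000012L});
        static final BitSet FOLLOW_ID_in_stat20 =
            BitSet.valueOf(new long[]{0x0000000000000040L});
    }

    // Second holder class: again a separate static initializer.
    static final class FollowSets2 {
        static final BitSet FOLLOW_INT_in_expr30 =
            BitSet.valueOf(new long[]{0x0000000000000002L});
    }

    // A reference that used to read FOLLOW_expr_in_stat10 now goes
    // through the holder class instead.
    static boolean canFollowStat(int tokenType) {
        return FollowSets1.FOLLOW_expr_in_stat10.get(tokenType);
    }

    public static void main(String[] args) {
        System.out.println(canFollowStat(1));
        System.out.println(canFollowStat(2));
    }
}
```

How many constants go into each holder class is a free choice, as long as every
holder's initializer stays under the limit.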

Yours is definitely better:)
On 2012-8-16 at 1:13 AM, "Francis ANDRE" <francis.andre.kampbell at orange.fr> wrote:

> On 15/08/2012 16:17, Zhaohui Yang wrote:
>
> It's great someone is already trying a fix. I'd be glad to test your fix
> when it's out.
>
> Would you please describe briefly what kind of fix it is? Is it for
> ANTLRWorks or the ANTLR tool? Is it a command-line option for separating the
> FOLLOW sets or suppressing them, or something else?
>
> The 64K syndrome is purely a Java problem, due to the constraint that the
> JVM does not support a static initializer larger than 64K (shame on it).
> Thus, if you look at the generated lexer and parser, you will certainly see
> a lot of DFA classes, each of them having some static initializer values.
> The point is that the sum of the static initializers of all those DFAs is
> greater than 64K, while the static initialization of each DFA is fairly
> small, or in most cases less than 64K. Thus, one solution is to extract all
> those DFA classes and put them outside the lexer or the parser, in fixed
> directories following this pattern:
>
> Let <grammar> be the directory of the grammar to generate; then all the
> generated DFAs will go in
>
> for the lexer's DFAs:    package <grammar>.lexer;
> for the parser's DFAs:   package <grammar>.parser;
>
> and the references to all those DFAs will be
> in the lexer:            import <grammar>.lexer.*;
> in the parser:           import <grammar>.parser.*;
>
> But hold on, the fix has to be approved by Terr and I have not yet submitted
> it. It needs to pass all the unit tests of ANTLR 3.4 and I am working on
> it... There is a real challenge in getting the parser/lexer compiled when the
> Java code is generated without a package, and all those unit tests produce
> the Java parser/lexer in the top-level directory.
>
>
> 2012/8/15 Francis ANDRE <francis.andre.kampbell at orange.fr>
>
>> Hi Zhaohui
>>
>> I am currently working on fixing this issue with ANTLR 3.4... Once I
>> have a proper patch, would you be interested in testing it?
>>
>> FA
>> On 14/08/2012 18:05, Zhaohui Yang wrote:
>>
>> Hi,
>>>
>>> Here we have a big grammar, and the generated parser.java fails to
>>> compile with the error "the code for the static initializer is exceeding
>>> the 65535 bytes limit".
>>>
>>> I've searched the net for a while and found that this is a widely known
>>> limit in the JVM or the javac compiler, and that there is not yet an option
>>> to raise it.
>>>
>>> On the ANTLR side, I found 2 solutions proposed by others, but neither of
>>> them is totally satisfying:
>>>
>>> 1. Separate the big grammar into 2 *.g files and import one from the other.
>>>     Yes, this removes the compilation error in the generated Java. But
>>> ANTLRWorks does not support imported grammars well. E.g., I cannot
>>> interpret a rule in the imported grammar; it's simply not in the rule list
>>> for interpreting. And gunit always fails with rules defined in the imported
>>> grammar.
>>>
>>> 2. Modify the generated Java source: separate the "FOLLOW_xxx_in_yyy"
>>> constants into several static classes and change the references to them
>>> accordingly.
>>>     This is proposed here -
>>> http://www.antlr.org/pipermail/antlr-interest/2009-November/036608.html.
>>> The author of the post actually has a solution in the ANTLR source code
>>> (some string template), but I can't find the attachment he referred to. And
>>> that was in 2009, so I suspect the fix could be incompatible with the
>>> current ANTLR version.
>>>     Without this fix we have to do the modification manually or write a
>>> script for it. The script is not that easy.
>>>
>>> And we found a 3rd solution by ourselves, which also involves changing the
>>> generated Java:
>>>
>>> 3. Remove those FOLLOW_... constants completely, and replace the
>>> references with "null".
>>>     Surprisingly this works; there is just no error recovery afterwards,
>>> which is not a problem for us. But we really worry that this is unsafe,
>>> since it's not documented anywhere.
>>>
>>> All in all, we're looking for any other solution that is easier to apply,
>>> assuming we'll be constantly changing the grammar and recompiling the
>>> parser.
>>>
>>> Maybe there is a way to get ANTLRWorks and gunit to play well with
>>> imported grammars?
>>> Maybe there is already a command-line option for the ANTLR Tool that can
>>> generate the FOLLOW_... constants in separate classes?
>>> Maybe there is already a command-line option for the ANTLR Tool that can
>>> suppress the FOLLOW_... constants' code generation?
>>>
>>>
>>
>
>
> --
> Regards,
>
> Yang, Zhaohui
>
>
>