[antlr-interest] problem about "the code for the static initializer is exceeding the 65535 bytes limit"

Francis ANDRE francis.andre.kampbell at orange.fr
Wed Aug 15 10:14:03 PDT 2012


On 15/08/2012 16:17, Zhaohui Yang wrote:
> It's great someone is already trying a fix. I'd be glad to test your 
> fix when it's out.
>
> Would you please describe briefly what kind of fix it is? Is it for 
> ANTLRWorks or the ANTLR tool? Is it a command-line option for separating 
> the FOLLOW sets or suppressing them, or something else?
The 64K syndrome is a pure Java problem: the JVM limits the bytecode of 
any single method, including a class's static initializer, to 64K 
(shame on it). If you look at the generated lexer and parser, you will 
certainly see a lot of DFA classes, each of them with some static 
initializer values. The point is that the sum of the static initializers 
of all those DFAs is greater than 64K, while the static initialization 
of each individual DFA is fairly small, or in most cases less than 64K. 
Thus, one solution is to extract all those DFA classes and put them 
outside the lexer or the parser, in fixed directories following this 
pattern:

Let <grammar> be the directory of the grammar to generate. All the 
generated DFAs will then go in

for the lexer's DFAs:    package <grammar>.lexer;
for the parser's DFAs:   package <grammar>.parser;

and those DFAs will be referenced
in the lexer:                 import <grammar>.lexer.*;
in the parser:                import <grammar>.parser.*;
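
To illustrate why moving the tables out helps, here is a minimal sketch, 
not the actual patch. It assumes hypothetical holder classes (Dfa1Tables, 
Dfa2Tables) with tiny made-up tables, and it uses nested classes instead 
of separate packages only to stay in one file. Each class gets its own 
static initializer, so each one is checked against the 64K limit on its 
own:

```java
import java.util.ArrayList;
import java.util.List;

public class SplitInitializerSketch {
    // Records which holder classes have run their static initializer, and in what order.
    public static final List<String> initLog = new ArrayList<>();

    // Hypothetical stand-in for one extracted DFA. Its table is filled in by
    // this class's own static initializer, not by the enclosing class's.
    static final class Dfa1Tables {
        static final short[] EOT = {-1, 2, -1};
        static { initLog.add("Dfa1Tables"); }
    }

    // A second hypothetical holder, again with its own static initializer.
    static final class Dfa2Tables {
        static final short[] EOT = {-1, -1, 3, 4};
        static { initLog.add("Dfa2Tables"); }
    }

    // Touching a holder's table is what triggers that holder's initialization.
    public static int predictTableSize(int decision) {
        return decision == 1 ? Dfa1Tables.EOT.length : Dfa2Tables.EOT.length;
    }
}
```

Calling predictTableSize(2) and then predictTableSize(1) initializes 
Dfa2Tables first and Dfa1Tables second, which shows that the tables no 
longer share a single initializer with the enclosing class.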

But hold on: the fix has to be approved by Terence, and I have not 
submitted it yet. It needs to pass all the unit tests of ANTLR 3.4, and 
I am working on it... There is a real challenge in getting the 
parser/lexer to compile for Java code generated without a package, and 
all those unit tests produce the Java parser/lexer in the top-level 
directory.
>
> 2012/8/15 Francis ANDRE <francis.andre.kampbell at orange.fr 
> <mailto:francis.andre.kampbell at orange.fr>>
>
>     Hi Zhaohui
>
>     I am currently working on fixing this issue in ANTLR 3.4... Once
>     I have a proper patch, would you be interested in testing it?
>
>     FA
>     On 14/08/2012 18:05, Zhaohui Yang wrote:
>
>         Hi,
>
>         Here we have a big grammar, and the generated parser.java fails
>         to compile with the error: "the code for the static initializer
>         is exceeding the 65535 bytes limit".
>
>         I've searched the net for a while and found that this is a widely
>         known limit in the JVM or the javac compiler, and that there is
>         not yet an option to raise it.
>
>         On the ANTLR side, I found 2 solutions proposed by others, but
>         neither of them is totally satisfying:
>
>         1. Separate the big grammar into 2 *.g files and import one from
>         the other.
>             Yes, this removes the compilation error in the generated
>         Java. But ANTLRWorks does not support imported grammars well.
>         E.g., I cannot interpret a rule in the imported grammar; it is
>         simply not in the rule list for interpreting. And gunit always
>         fails on rules defined in the imported grammar.
>
>         2. Modify the generated Java source: separate the
>         "FOLLOW_xxx_in_yyy" constants into several static classes and
>         change the references to them accordingly.
>             This is proposed here -
>         http://www.antlr.org/pipermail/antlr-interest/2009-November/036608.html
>         .
>         The author of that post actually has a solution in the ANTLR
>         source code (some string template). But I can't find the
>         attachment he referred to. And that was in 2009, so I suspect the
>         fix could be incompatible with the current ANTLR version.
>             Without this fix we have to do the modification manually or
>         write a script for it. The script is not that easy.
>
>         And we found a 3rd solution ourselves, which also involves
>         changing the generated Java:
>
>         3. Remove those FOLLOW_... constants completely, and replace
>         the references with "null".
>             Surprisingly, this works; there is just no error recovery
>         afterwards, which is not a problem for us. But we really worry
>         that this is unsafe, since it's not documented anywhere.
>
>         After all, we're looking for any other solution that is easier
>         to apply, assuming we'll be constantly changing the grammar and
>         recompiling the parser.
>
>           Maybe there is a way to get ANTLRWorks and gunit to play well
>         with imported grammars?
>         Maybe there is already a command-line option for the ANTLR Tool
>         that can generate the FOLLOW_... constants in separate classes?
>         Maybe there is already a command-line option for the ANTLR Tool
>         that can suppress the FOLLOW_... constants code generation?
>
>
>
>
>
> -- 
> Regards,
>
> Yang, Zhaohui
>


