[antlr-interest] ANTLR Parser file different on different machines - method to exceed 65535 characters

Tue Aug 23 03:27:44 PDT 2011

Hi, first of all, a little bit of background that you might find interesting in terms of what ANTLR is being used for...

I have been using ANTLR to convert a language called STL into JavaScript. STL is used to define control procedures for commanding European Space Agency ground station monitoring and control equipment. It is a custom language used only by ESA and is quite quirky in terms of grammar. The existing system is around 15 years old and is being replaced with a new one. The STL procedures are being converted into equivalent JavaScript, which will run in a Rhino engine. My grammar for this conversion is more or less complete, and the new JavaScript procedures are successfully commanding ground station equipment running in a simulator. I should also mention that the target language for the grammar is Java and that I am using antlr-3.2.

Now for the problem and the reason for my message...

As part of the build process for the application doing the conversion, ANTLR is run from ant to generate the lexer and parser files before the Java application is built. Previously I had been generating the parser and lexer in our environment and committing the resulting Java files to the repository rather than performing the generation as part of the build. The generation works perfectly well here on a number of different development machines, and even on a virtual machine with a small amount of memory.

Typically, however, when we deliver the software to ESA and they run the build process on their machines, the parser file produced is different and the build fails. The reason for the build failure is that the "static final String DFA53_specialS" and "static final String DFA53_transitionS" arrays are being produced with a huge number of elements, together with a massive switch statement in a "specialStateTransition()" method, which causes the method to exceed Java's 65535-byte limit on the size of a single method.
I have read that this can occur with a complex and/or unoptimised grammar, and I will be the first to admit that the grammar I have written might not be 100% optimised. However, since the parser generation works on our machines and not on ESA's, I don't believe my limited ANTLR-fu is the root cause of this problem. I have also confirmed that the version of ANTLR being used is exactly the same.
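
To compare the two generated parser files, something like the following could be run against each of them. It is only a rough sketch: the class name SpecialStateReport is made up for illustration, and it assumes the generated DFA inner classes are declared as "class DFAnn extends DFA" and that the switch labels look like "case 0 :". It counts the case labels that follow each DFA class declaration, which gives a crude idea of which decision has blown up.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of ANTLR: reports how many "case N :" labels
// follow each generated DFA inner class in a parser source file.
public class SpecialStateReport {

    // Assumed layout of the generated code: "class DFA53 extends DFA {"
    private static final Pattern DFA_HEADER =
            Pattern.compile("class\\s+(DFA\\d+)\\s+extends\\s+DFA\\b");

    // Assumed layout of the switch labels: "case 12 :"
    private static final Pattern CASE_LABEL =
            Pattern.compile("\\bcase\\s+\\d+\\s*:");

    /** Maps each DFA class name to the number of case labels found after its header. */
    public static Map<String, Integer> countCases(String source) {
        Map<String, Integer> counts = new LinkedHashMap<String, Integer>();
        Matcher header = DFA_HEADER.matcher(source);
        String currentName = null;
        int segmentStart = 0;
        while (header.find()) {
            if (currentName != null) {
                // Text between the previous header and this one belongs to the previous DFA.
                counts.put(currentName,
                        countMatches(CASE_LABEL, source.substring(segmentStart, header.start())));
            }
            currentName = header.group(1);
            segmentStart = header.end();
        }
        if (currentName != null) {
            // The last DFA's segment runs to the end of the file.
            counts.put(currentName, countMatches(CASE_LABEL, source.substring(segmentStart)));
        }
        return counts;
    }

    private static int countMatches(Pattern pattern, String text) {
        Matcher m = pattern.matcher(text);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) throws Exception {
        // args[0] is the path to a generated parser source file.
        String source = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        for (Map.Entry<String, Integer> e : countCases(source).entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue() + " case labels");
        }
    }
}
```

Running it on the parser file from each machine and comparing the two reports should show whether DFA53 is the only decision that differs.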

I have done a lot of searching through the mailing list archives and found a suggestion that using -Xconversiontimeout 100000 as an input to ANTLR might help solve this issue, but that doesn't seem to be helping. Just in case I did this wrong, this is how I used this option to generate the file:

java -Xms1024M -Xmx1024M -jar antlr-3.2.jar -Xconversiontimeout 100000

Is there anything else that might cause a large switch block in specialStateTransition() on one machine and not another with the same grammar and the same ANTLR version? Should I get them to try even higher values for -Xconversiontimeout, or am I barking up the wrong tree with this?
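
To help narrow that down, a small report like the one below could be run on both machines and the outputs compared. It is only a sketch: BuildEnvReport is a made-up name, it uses nothing but standard JDK calls, and it rests on my assumption that the conversion timeout is measured in elapsed time, so that the JVM version, CPU count and available heap could plausibly make a difference.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of ANTLR: collects JVM and OS details that
// might differ between two build machines.
public class BuildEnvReport {

    /** Returns the environment details in a stable order so two reports can be diffed. */
    public static Map<String, String> describe() {
        Map<String, String> env = new LinkedHashMap<String, String>();
        String[] keys = {
            "java.version", "java.vendor", "java.vm.name",
            "os.name", "os.arch", "file.encoding"
        };
        for (String key : keys) {
            env.put(key, System.getProperty(key));
        }
        Runtime rt = Runtime.getRuntime();
        env.put("availableProcessors", String.valueOf(rt.availableProcessors()));
        // Maximum heap the JVM will attempt to use, in megabytes.
        env.put("maxMemoryMB", String.valueOf(rt.maxMemory() / (1024L * 1024L)));
        return env;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> e : describe().entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

It should be launched with the same JVM and the same -Xms/-Xmx settings that are used for the ANTLR run, otherwise the heap figure will not be comparable.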

If necessary I can post the grammar I am using (minus the inline code) and an example of the STL language being parsed, but I don't think that's necessary at this stage. In any case, it will probably hurt your eyes.

Thanks very much in advance for any help.

Luke