[antlr-interest] (no subject)

Sam Harwell sam at tunnelvisionlabs.com
Thu Nov 15 10:56:14 PST 2012


While the string length issue is straightforward to analyze, it's very hard to predict the compiled class size to even know when to split up the code, much less what the optimal division would look like.
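
As a rough illustration of why the string length side is the tractable one: a string constant in a Java class file can hold at most 65535 bytes of modified UTF-8, so one long literal can be emitted as several shorter ones and joined at runtime. The sketch below is only that, a sketch. The class and method names (StringSegments, split, join) are made up for illustration and are not taken from ANTLR's generated code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of ANTLR: splits a long string into pieces
// that each fit in a single class-file string constant.
final class StringSegments {
    // A class-file string constant holds at most 65535 bytes of modified UTF-8,
    // and each Java char needs at most 3 bytes in that encoding, so 65535 / 3
    // chars per piece is always safe.
    static final int MAX_CHARS = 65535 / 3;

    // Cuts data into consecutive pieces of at most maxChars chars each.
    static List<String> split(String data, int maxChars) {
        List<String> parts = new ArrayList<>();
        for (int i = 0; i < data.length(); i += maxChars) {
            parts.add(data.substring(i, Math.min(data.length(), i + maxChars)));
        }
        return parts;
    }

    // Reassembles the pieces at runtime, in order.
    static String join(List<String> parts) {
        StringBuilder sb = new StringBuilder();
        for (String p : parts) {
            sb.append(p);
        }
        return sb.toString();
    }
}
```

A code generator would emit each piece as its own literal and call something like join once at class initialization. Nothing comparably mechanical exists for predicting compiled method size.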

Regarding #2, Microsoft thought this through much, much better than the JVM team. I think the code size limit in C# is some 4 orders of magnitude greater than in Java (.NET limit over 500MB compared to 64K per method in Java), and the internal representation of array initialization data is much more compact. The performance of the C# port should absolutely obliterate even my optimized Java version... once I port it, that is... My custom-tuned build of v3 is more than 4x faster than the Java target, and I'm planning to use several things I learned from that in the v4 port.

--
Sam Harwell
Owner, Lead Developer
http://tunnelvisionlabs.com

From: Pascal Parrot [mailto:pascal_parrot at hotmail.com]
Sent: Thursday, November 15, 2012 12:37 PM
To: Sam Harwell; antlr-interest at antlr.org
Subject: RE: [antlr-interest] (no subject)


One sample grammar (example.g4) is attached here:
http://www.antlr.org/pipermail/antlr-interest/attachments/20121114/c1188d89/attachment.zip

The initial version had a "string too long" error. I no longer get that error if I use a hashmap for keywords, as described in the reference book (see attachment in link). However, I am getting a new error now, even when I use the -Xforce-atn option.
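
For readers following the thread, the keyword hashmap idea looks roughly like this in plain Java. This is a minimal sketch, not the code from example.g4 or from the book: the class name KeywordTable, the keywords, and the token type numbers are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: names and token type numbers are illustrative only,
// not taken from example.g4 or from ANTLR's generated code.
final class KeywordTable {
    static final int ID = 1;       // assumed token type for plain identifiers
    static final int KW_BEGIN = 2; // assumed token types for two sample keywords
    static final int KW_END = 3;

    private static final Map<String, Integer> KEYWORDS = new HashMap<>();
    static {
        KEYWORDS.put("begin", KW_BEGIN);
        KEYWORDS.put("end", KW_END);
    }

    // Returns the keyword's token type, or ID when the text is not a keyword.
    static int tokenTypeFor(String text) {
        Integer type = KEYWORDS.get(text);
        return type != null ? type : ID;
    }

    public static void main(String[] args) {
        System.out.println(tokenTypeFor("begin"));   // a keyword
        System.out.println(tokenTypeFor("counter")); // an ordinary identifier
    }
}
```

The lexer then needs only one identifier rule, whose action could call something like setType(KeywordTable.tokenTypeFor(getText())), instead of one literal rule per keyword. The keywords live in a runtime table rather than in the generated lexer tables.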

This is just an example grammar, so if this error is fixed, a new "too large, too long" error will probably pop up somewhere else. So, it brings up 2 questions:
1) Is it on the ANTLR roadmap to check the generated code against Java's size limits and split it up if necessary?
2) If not, using a different target language is the only option for large grammars, isn't it?

Pascal
> From: sam at tunnelvisionlabs.com
> To: pascal_parrot at hotmail.com; antlr-interest at antlr.org
> Subject: RE: [antlr-interest] (no subject)
> Date: Thu, 15 Nov 2012 15:15:29 +0000
>
> ANTLR 4 is not currently optimized for this use case. You might be able to reduce the code size a bit by passing the -Xforce-atn flag when you generate your grammar.
>
> To help with optimizing the generated code, can you provide me with one of the grammars that's causing a problem?
>
> Thank you,
> --
> Sam Harwell
> Owner, Lead Developer
> http://tunnelvisionlabs.com
>
> -----Original Message-----
> From: antlr-interest-bounces at antlr.org On Behalf Of Pascal Parrot
> Sent: Thursday, November 15, 2012 1:55 AM
> To: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] (no subject)
>
>
> Jim,
> Yes, I looked at the generated code, but the error is on this line:
> protected static final PredictionContextCache _sharedContextCache = new PredictionContextCache();
> PredictionContextCache does not appear anywhere else in the file and _sharedContextCache is a parameter in a function.
>
> Even if there was a huge something there, I wouldn't know what to do with it.
>
> I guess my question is:
> Is ANTLR (Java) suited for grammars with large sets of keywords and many parser rules?
> If it is, how should the grammar be organized so that the generated code fits within Java's size limits? Using hashmaps helps in the lexer, but what about in the parser?
>
> Pascal

