[antlr-interest] ANTLR gives segmentation fault for very large input

Sam Harwell sharwell at pixelminegames.com
Thu Jul 21 08:45:15 PDT 2011


To skip the AST, just don't use the "output=AST" option.

Here are some specs on the tokens. I'm including the overhead of having them in a CommonTokenStream (or equivalent) because they're not very useful otherwise.

Java target, 32-bit VM: 48 bytes/token.
Java target, 64-bit VM: 64 bytes/token.

CSharp3 target: Same as Java target.

C target, 32-bit: 148 bytes/token.
C target, 64-bit: 248 bytes/token.

You have 6 tokens per line, and it sounds like you're using the C target. On a 32-bit machine, the tokens alone for the small (45-line) and large (4,500,000-line) files use 39KiB and 3.72GiB of memory respectively. On a 64-bit machine they use 65.4KiB and 6.24GiB.
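
As a rough check of these figures, the arithmetic can be sketched in Python. The function and dictionary names are only illustrative; the per-token sizes are the ones listed above, and the line counts (45 and 4,500,000) are the ones given in the original question.

```python
# Bytes per token for the C target, as listed above (token stream overhead included).
BYTES_PER_TOKEN = {"c32": 148, "c64": 248}
TOKENS_PER_LINE = 6  # figure for this particular input


def token_memory_bytes(lines, bytes_per_token, tokens_per_line=TOKENS_PER_LINE):
    """Estimate the memory taken by the tokens alone for an input of `lines` lines."""
    return lines * tokens_per_line * bytes_per_token


for target, size in BYTES_PER_TOKEN.items():
    small = token_memory_bytes(45, size)
    large = token_memory_bytes(4_500_000, size)
    print(f"{target}: {small / 2**10:.1f} KiB small, {large / 2**30:.2f} GiB large")
```

This counts tokens only; the input buffer and any tree that gets built come on top of it.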

I'm developing an alternative to CommonToken that uses 8 bytes/token in all of the above targets. Once it's ready (which may not be until ANTLR v4), the same files will only need 2.1KiB/206MiB of memory, a savings of 94.6% on the 32-bit C target, and nearly 97% on the 64-bit C target.
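
The savings figures follow directly from the per-token sizes. A minimal sketch of that calculation, with an illustrative helper name, assuming the same 6 tokens per line and the 4,500,000-line file from the question:

```python
def savings_percent(old_bytes_per_token, new_bytes_per_token):
    """Percentage of token memory saved by moving to a smaller token."""
    return 100.0 * (1 - new_bytes_per_token / old_bytes_per_token)


COMPACT_BYTES = 8          # bytes/token for the compact token described above
TOKENS = 4_500_000 * 6     # large file: lines * tokens per line

print(f"large file: {TOKENS * COMPACT_BYTES / 2**20:.0f} MiB")
print(f"32-bit C target: {savings_percent(148, COMPACT_BYTES):.1f}% saved")
print(f"64-bit C target: {savings_percent(248, COMPACT_BYTES):.1f}% saved")
```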

-----Original Message-----
From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Piyush
Sent: Thursday, July 21, 2011 10:18 AM
To: antlr-interest at antlr.org
Subject: Re: [antlr-interest] ANTLR gives segmentation fault for very large input

Is there any way to skip building the AST (Abstract Syntax Tree)? It is of no use for my work.

On Tue, Jul 19, 2011 at 9:08 PM, Jim Idle [via ANTLR] <ml-node+6599207-454424018-346774 at n2.nabble.com> wrote:
> You are running out of memory - split up the input in some sensible way.
>
> Jim
>
>> -----Original Message-----
>> From: [hidden email] [mailto:antlr-interest- [hidden email]] On 
>> Behalf Of Piyush
>> Sent: Tuesday, July 19, 2011 1:51 AM
>> To: [hidden email]
>> Subject: [antlr-interest] ANTLR gives segmentation fault for very 
>> large input
>>
>> Sir, when I try to parse a very big input file (nearly
>> 4500000 lines), ANTLR gives a segmentation fault.
>>
>> For example, my grammar funny.g parses the input file input.v (about
>> 45 lines) but gives a segmentation fault for big_file_input.v (about
>> 4500000 lines), which contains the same input as input.v repeated
>> 100000 times.
>>
>> I am attaching my grammar (funny.g and input files) below.
>>
>> Please help me find what I am doing wrong, or is this a bug in ANTLR?
>>
>>
>> Thanking You
>> Piyush http://antlr.1301665.n2.nabble.com/file/n6598011/fun.tar 
>> fun.tar
>>
>> --
>> View this message in context: 
>> http://antlr.1301665.n2.nabble.com/ANTLR-
>> gives-segmentation-fault-for-very-large-input-tp6598011p6598011.html
>> Sent from the ANTLR mailing list archive at Nabble.com.
>>
>> List: http://www.antlr.org/mailman/listinfo/antlr-interest
>> Unsubscribe: 
>> http://www.antlr.org/mailman/options/antlr-interest/your-
>> email-address



Cheers!
Piyush
Bengal Engineering & Science University
Computer Science and Technology


--
View this message in context: http://antlr.1301665.n2.nabble.com/ANTLR-gives-segmentation-fault-for-very-large-input-tp6598011p6607198.html
Sent from the ANTLR mailing list archive at Nabble.com.
