[antlr-interest] Out of Memory

Mark Boylan boylan.mark at gmail.com
Mon Oct 5 05:16:42 PDT 2009


It feels a little redundant, but I think that is the right solution.
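
For the archive, here is a rough sketch of the kind of splitter Indhu
describes below. It is only a sketch under some assumptions: the class
name PgnSplitter and the callback are made up for illustration, and it
treats any line starting with "[" that follows movetext as the start of
the next game, so a multi-line brace comment containing such a line
would be split incorrectly.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.function.Consumer;

// Hypothetical helper, not part of ANTLR: streams a PGN file one game at a time
// so that only a single game's text is held in memory at once.
final class PgnSplitter {

    /**
     * Reads PGN text line by line and passes each complete game to onGame.
     * Returns the number of games found.
     */
    static int split(Reader in, Consumer<String> onGame) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        StringBuilder game = new StringBuilder();
        boolean sawMovetext = false; // true once a non-tag, non-blank line was read
        int count = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            boolean isTag = line.startsWith("[");
            // Assumption: a tag line after movetext begins the next game.
            if (isTag && sawMovetext) {
                onGame.accept(game.toString());
                count++;
                game.setLength(0);
                sawMovetext = false;
            }
            if (!isTag && !line.trim().isEmpty()) {
                sawMovetext = true;
            }
            game.append(line).append('\n');
        }
        // Emit the final game, if the input did not end on blank lines only.
        if (!game.toString().trim().isEmpty()) {
            onGame.accept(game.toString());
            count++;
        }
        return count;
    }
}
```

Each string handed to the callback could then be wrapped in a fresh
ANTLRStringStream and run through the generated lexer and parser, so the
token and parse data for one game can be garbage collected before the
next game is read.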


On Mon, Oct 5, 2009 at 4:18 AM, Indhu Bharathi <indhu.b at s7software.com> wrote:
> Is it possible to write a separate program to break the PGN files into
> separate games and pass each game to the lexer/parser? That will be a simple
> solution assuming there is an easy way to split games in a PGN file.
>
>
> -----Original Message-----
> From: antlr-interest-bounces at antlr.org
> [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Mark Boylan
> Sent: Monday, October 05, 2009 4:49 AM
> To: antlr-interest at antlr.org
> Subject: Re: [antlr-interest] Out of Memory
>
> Answering my own question :)
>
> I wrote a class named ANTLRMemoryMappedFileStream which is nearly a
> complete ripoff of ANTLRStringStream, except backed by a memory-mapped
> file. That solved the buffering problem. But now, I get an
> OutOfMemoryError in the parser with my big test file.
>
> I'm parsing chess games in PGN (Portable Game Notation) format. A game
> in PGN format is usually under 1 KB, but a PGN file can contain many
> games. Most PGN files have several thousand games and those are no
> problem for ANTLR and my grammar. But PGN files with a million or
> more games are not rare -- especially in the case where a database
> user wants to restore an entire collection, or move it to a new chess
> database management program (like the one I'm working on). So, it's
> important for me to be able to parse these huge files.
>
> I'm wondering if it's possible for the Parser to notify the Stream
> that a game has been parsed. At that point, the Stream implementation
> can flip the buffer. Does that sound like something that might work?
> Is that possible?
>
>
>
> On Sun, Oct 4, 2009 at 4:07 AM, Mark Boylan <boylan.mark at gmail.com> wrote:
>> Hi.
>>
>> My grammar is working really well with smaller test files, but I run
>> out of heap space on large files. Unfortunately, my users will expect
>> to be able to load pretty big files occasionally (~1GB).
>>
>> Looking at the code documentation for the ANTLR 3 stream classes, it
>> looks like they copy the entire stream. I'm thinking that I need to
>> write a custom implementation of IntStream or CharStream that buffers
>> the input. Is that the right way to solve this? Can someone point me
>> in the right direction?
>>
>> Thanks!
>>
>>  - mark
>>

