[antlr-interest] Out of Memory

Indhu Bharathi indhu.b at s7software.com
Mon Oct 5 01:18:50 PDT 2009


Is it possible to write a separate program to break the PGN files into
separate games and pass each game to the lexer/parser? That will be a simple
solution assuming there is an easy way to split games in a PGN file.
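
To make the suggestion concrete, here is a rough sketch of such a splitter in Java. It is only an illustration, not tested against real-world PGN collections. It assumes that every game starts with a tag pair section whose lines begin with "[" at the start of the line, and that no movetext or comment line begins with "[". The class name PgnSplitter and the method splitGames are made up for this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.function.Consumer;

// Hypothetical helper: streams a PGN file and emits one game at a time,
// so only a single game's text is held in memory.
class PgnSplitter {

    // Reads PGN text line by line and passes each complete game to onGame.
    // Returns the number of games emitted.
    static int splitGames(Reader in, Consumer<String> onGame) {
        StringBuilder game = new StringBuilder();
        boolean seenMovetext = false; // true once a non-tag, non-blank line was read
        int count = 0;
        try {
            BufferedReader reader = new BufferedReader(in);
            String line;
            while ((line = reader.readLine()) != null) {
                boolean isTagLine = line.startsWith("[");
                if (isTagLine && seenMovetext) {
                    // Assumption: a tag line after movetext starts the next game.
                    onGame.accept(game.toString());
                    count++;
                    game.setLength(0);
                    seenMovetext = false;
                } else if (!isTagLine && !line.trim().isEmpty()) {
                    seenMovetext = true;
                }
                game.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Emit the last game, if the input did not end with only blank lines.
        if (!game.toString().trim().isEmpty()) {
            onGame.accept(game.toString());
            count++;
        }
        return count;
    }
}
```

The callback could wrap each game string in a new ANTLRStringStream and run the existing lexer/parser on it, so that each game's token and character buffers can be garbage collected before the next game is read.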


-----Original Message-----
From: antlr-interest-bounces at antlr.org
[mailto:antlr-interest-bounces at antlr.org] On Behalf Of Mark Boylan
Sent: Monday, October 05, 2009 4:49 AM
To: antlr-interest at antlr.org
Subject: Re: [antlr-interest] Out of Memory

Answering my own question :)

I wrote a class named ANTLRMemoryMappedFileStream which is nearly a
complete ripoff of ANTLRStringStream, except backed by a memory-mapped
file. That solved the buffering problem. But now, I get an
OutOfMemoryError in the parser with my big test file.

I'm parsing chess games in PGN (portable game notation) format. A game
in PGN format is usually under 1k, but a PGN file can contain many
games. Most PGN files have several thousand games and those are no
problem for Antlr and my grammar. But, PGN files with a million or
more games are not rare -- especially in the case where a database
user wants to restore an entire collection, or move it to a new chess
database management program (like the one I'm working on). So, it's
important for me to be able to parse these huge files.

I'm wondering if it's possible for the Parser to notify the Stream
that a game has been parsed. At that point, the Stream implementation
can flip the buffer. Does that sound like something that might work?
Is that possible?



On Sun, Oct 4, 2009 at 4:07 AM, Mark Boylan <boylan.mark at gmail.com> wrote:
> Hi.
>
> My grammar is working really well with smaller test files, but I run
> out of heap space on large files. Unfortunately, my users will expect
> to be able to load pretty big files occasionally (~1GB).
>
> Looking at the code documentation for the Antlr3 stream classes, it
> looks like they copy the entire stream. I'm thinking that I need to
> write a custom implementation of IntStream or CharStream that buffers
> the input. Is that the right way to solve this? Can someone point me
> in the right direction?
>
> Thanks!
>
>  - mark
>
