[antlr-interest] Re: Java memory mapped IO is slow for big files :(

lgcraymer lgc at mail1.jpl.nasa.gov
Wed Nov 17 14:22:19 PST 2004



--- In antlr-interest at yahoogroups.com, Terence Parr <parrt at c...> wrote:
> Howdy,
> 
> In typical fashion, your expectations are not always met with java 
> libraries.  I'm using jdk 1.4.2 on my os x box.  I expected that memory 
> mapping a big file would be very fast, but it appears that reading it a 
> chunk at a time is MUCH faster (even using ANTLR 2):
> 
> Reading a 44M file 1 time:
> 
> 2m15s memory mapped IO
> 1m05s ANTLR 2 small buffer
> 2m12s ANTLR 3 with char[size-of-file]
> 
> So reading into a small buffer (BufferedReader) wins easily over making 
> a huge buffer.
> 
> Now reading a small 44 line (1173 byte) file 500 times:
> 
> 0.69s memory mapped IO
> 2.35s ANTLR 2 small buffer
> 0.76s ANTLR 3 with char[size-of-file]
> 

Ter--

Given the normal variations in timing, it looks like memory-mapped I/O
and char[file] are almost identical.  That is reasonable--they should
be pretty close, and the figures look right:  for the large file,
reading the entire file will be slightly faster than paging the
mmapped file, but for the small file, mmap will page in 4K while the
char[size-of-file] will do two separate accesses.  What do you bet
that the difference between those and the buffered read() is bounds
checking?  Try access via StringReader, StringCharacterIterator, and
ByteArrayInputStream--any one (or all) of those might have a more
efficient implementation that minimizes the bounds checks.
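
For anyone who wants to reproduce the comparison, here is a minimal sketch
of the three access patterns under discussion: a small streaming buffer, a
single char[size-of-file], and a memory-mapped buffer. It is not the ANTLR
code that produced the timings above. The class name ReadStrategies and the
newline-counting workload are made up for illustration, it assumes a
single-byte encoding (ISO-8859-1) so that one mapped byte corresponds to one
char, and it uses a newer JDK than 1.4.2.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Hypothetical harness: each method counts '\n' characters so that the
// three strategies do the same work and their results can be cross-checked.
public class ReadStrategies {

    // Strategy 1: stream the file through BufferedReader's small internal buffer.
    static long newlinesViaBufferedReader(Path file) throws IOException {
        long count = 0;
        try (BufferedReader in =
                 Files.newBufferedReader(file, StandardCharsets.ISO_8859_1)) {
            int c;
            while ((c = in.read()) != -1) {
                if (c == '\n') count++;
            }
        }
        return count;
    }

    // Strategy 2: load the whole file into one char[] and index into it.
    static long newlinesViaCharArray(Path file) throws IOException {
        byte[] raw = Files.readAllBytes(file);
        char[] data = new String(raw, StandardCharsets.ISO_8859_1).toCharArray();
        long count = 0;
        for (int i = 0; i < data.length; i++) {
            if (data[i] == '\n') count++;
        }
        return count;
    }

    // Strategy 3: memory-map the file and read bytes from the mapping.
    // A single mapping is limited to Integer.MAX_VALUE bytes.
    static long newlinesViaMmap(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buf =
                ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            long count = 0;
            while (buf.hasRemaining()) {
                if (buf.get() == '\n') count++;
            }
            return count;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ReadStrategies <file>");
            return;
        }
        Path file = Paths.get(args[0]);

        long t0 = System.nanoTime();
        long a = newlinesViaBufferedReader(file);
        long t1 = System.nanoTime();
        long b = newlinesViaCharArray(file);
        long t2 = System.nanoTime();
        long c = newlinesViaMmap(file);
        long t3 = System.nanoTime();

        System.out.println("BufferedReader: " + a + " newlines, " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("char[] of file: " + b + " newlines, " + (t2 - t1) / 1_000_000 + " ms");
        System.out.println("memory mapped:  " + c + " newlines, " + (t3 - t2) / 1_000_000 + " ms");
    }
}
```

A single pass like this is only a rough guide: the first strategy to run
also pays for pulling the file into the OS cache, so it is worth running
the program a few times and rotating the order before reading much into
the numbers.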

--Loring





 