[antlr-interest] Re: Java memory mapped IO is slow for big files :(
lgcraymer
lgc at mail1.jpl.nasa.gov
Wed Nov 17 14:22:19 PST 2004
--- In antlr-interest at yahoogroups.com, Terence Parr <parrt at c...> wrote:
> Howdy,
>
> In typical fashion, your expectations are not always met with java
> libraries. I'm using jdk 1.4.2 on my os x box. I expected that memory
> mapping a big file would be very fast, but it appears that reading it a
> chunk at a time is MUCH faster (even using ANTLR 2):
>
> Reading a 44M file 1 time:
>
> 2m15s memory mapped IO
> 1m05s ANTLR 2 small buffer
> 2m12s ANTLR 3 with char[size-of-file]
>
> So reading into a small buffer (BufferedReader) wins easily over making
> a huge buffer.
>
> Now reading a small 44 line (1173 byte) file 500 times:
>
> 0.69s memory mapped IO
> 2.35s ANTLR 2 small buffer
> 0.76s ANTLR 3 with char[size-of-file]
>
Ter--
Given the normal variations in timing, it looks like memory-mapped I/O
and char[size-of-file] are almost identical. That is reasonable--they
should be pretty close, and the figures look right: for the large file,
reading the entire file will be slightly faster than paging the
mmapped file, but for the small file, mmap will page in 4K while the
char[size-of-file] will do two separate accesses. What do you bet
that the difference between those and the buffered read() is bounds
checking? Try access via StringReader, StringIterator, and
ByteArrayInputStream--any one (or all) of those might have a more
efficient implementation that minimizes the bounds checks.
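
For anyone who wants to repeat the comparison, here is a rough sketch of
the three read strategies from the timings above. It is not the code
used for those numbers: the class and method names are made up, it
assumes a single-byte encoding (ISO-8859-1) so that bytes and chars line
up, and it uses java.nio.file, which is newer than JDK 1.4.2. Each
method just counts newlines so the loops have something to do.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Hypothetical harness; names are illustrative, not from ANTLR.
final class ReadStrategies {

    // Strategy 1: memory-map the whole file and walk it one byte at a time.
    // Every get() goes through the buffer's own position/limit check.
    static long countMapped(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            long n = 0;
            while (buf.hasRemaining()) {
                if (buf.get() == '\n') n++;
            }
            return n;
        }
    }

    // Strategy 2: small fixed buffer, refilled as needed (BufferedReader).
    static long countBuffered(Path file) throws IOException {
        try (BufferedReader r = Files.newBufferedReader(file, StandardCharsets.ISO_8859_1)) {
            long n = 0;
            int c;
            while ((c = r.read()) != -1) {
                if (c == '\n') n++;
            }
            return n;
        }
    }

    // Strategy 3: read everything up front into a char[size-of-file].
    static long countWholeArray(Path file) throws IOException {
        char[] data = new String(Files.readAllBytes(file),
                                 StandardCharsets.ISO_8859_1).toCharArray();
        long n = 0;
        for (int i = 0; i < data.length; i++) {
            if (data[i] == '\n') n++;
        }
        return n;
    }

    public static void main(String[] args) throws IOException {
        Path file = Paths.get(args[0]);

        long t0 = System.nanoTime();
        long mapped = countMapped(file);
        long t1 = System.nanoTime();
        long buffered = countBuffered(file);
        long t2 = System.nanoTime();
        long whole = countWholeArray(file);
        long t3 = System.nanoTime();

        System.out.printf("mapped:   %d newlines, %.1f ms%n", mapped, (t1 - t0) / 1e6);
        System.out.printf("buffered: %d newlines, %.1f ms%n", buffered, (t2 - t1) / 1e6);
        System.out.printf("char[]:   %d newlines, %.1f ms%n", whole, (t3 - t2) / 1e6);
    }
}
```

Run it as "java ReadStrategies somefile". A single pass like this is
only a rough indication: JIT warm-up and the OS file cache will favor
whichever strategy runs later, so repeat the runs and vary the order
before reading much into the numbers.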
--Loring