[Numpy-discussion] Accessing data in a large file

Simon Lyngby Kokkendorff silyko at gmail.com
Thu Jun 17 04:21:04 EDT 2010


Hi list,

   I am new to this list, so forgive me if this is a trivial problem;
I would appreciate any help.

  I am using numpy to work with large amounts of data - sometimes too much
to fit into memory. Therefore I want to store the data in binary files and
use numpy to read chunks of each file into memory. I've tried numpy.memmap,
and also numpy.save followed by numpy.load with mmap_mode="r". However,
when I perform any nontrivial operation on a slice of the memmap, I always
end up reading the entire file into memory, which then leads to memory
errors. Is there a way to get numpy to do what I want, using an internal,
platform-independent numpy format like .npy, or do I have to wrap a custom
file reader with something like ctypes?
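For concreteness, here is a minimal sketch of the access pattern I have in
mind (the file name, array shape and chunk bounds are made up purely for
illustration):

    import numpy as np

    # Write a large array to a .npy file (shape chosen for illustration).
    np.save("big.npy", np.zeros((1000000, 10), dtype=np.float64))

    # Re-open it memory-mapped; ideally this should not touch the data yet.
    mm = np.load("big.npy", mmap_mode="r")

    # Copy just one chunk into a regular in-memory array and work on that.
    chunk = np.array(mm[100000:200000])
    print(chunk.mean())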
Of course numpy.fromfile is a possibility, but it seems a quite inflexible
alternative: it doesn't really support slices, and raw binary data may have
platform-dependency problems (byte order).
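For comparison, here is roughly what I would have to do with numpy.fromfile
on a flat binary file - the file name and record layout are again just an
example, and the offset bookkeeping is exactly what I would like to avoid:

    import numpy as np

    dtype = np.dtype("<f8")   # spell out little-endian float64 by hand
    ncols = 10                # record width, assumed known in advance
    start_row, nrows = 100000, 1000

    with open("big.raw", "rb") as f:
        # Seek to the start of the chunk, then read only that many items.
        f.seek(start_row * ncols * dtype.itemsize)
        chunk = np.fromfile(f, dtype=dtype, count=nrows * ncols)
    chunk = chunk.reshape(nrows, ncols)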

Hope someone can help. Cheers,
 Simon