[Numpy-discussion] Memory mapping and NPZ files
mathieu.dubois at icm-institute.org
Thu Dec 10 14:07:16 EST 2015
On 10/12/2015 15:35, Sebastian Berg wrote:
> On Mi, 2015-12-09 at 15:51 +0100, Mathieu Dubois wrote:
>> Dear all,
>> If I am correct, using mmap_mode with npz files has no effect, i.e.:
>> f = np.load("data.npz", mmap_mode="r")
>> X = f['X']
>> will load all the data in memory.
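One way to check this claim is sketched below. It assumes a NumPy version in which numpy.load ignores mmap_mode for npz archives (the behaviour described in this thread); the file names and the array are made up for illustration. It saves the same array as an npy file and as an npz file, loads both with mmap_mode="r", and compares the types that come back.

```python
import os
import tempfile

import numpy as np

with tempfile.TemporaryDirectory() as tmp:
    # Illustrative file names and data, not taken from the thread.
    npy_path = os.path.join(tmp, "X.npy")
    npz_path = os.path.join(tmp, "data.npz")
    X = np.arange(12, dtype=np.float64).reshape(3, 4)
    np.save(npy_path, X)
    np.savez(npz_path, X=X)

    # A plain npy file honours mmap_mode and yields a numpy.memmap.
    from_npy = np.load(npy_path, mmap_mode="r")

    # For an npz archive, indexing the NpzFile reads the member into memory.
    with np.load(npz_path, mmap_mode="r") as f:
        from_npz = f["X"]

    npy_is_memmap = isinstance(from_npy, np.memmap)
    npz_is_memmap = isinstance(from_npz, np.memmap)
    print("npy memory-mapped:", npy_is_memmap)
    print("npz memory-mapped:", npz_is_memmap)

    # Release the mapping before the temporary directory is removed.
    del from_npy
```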
> My take on it is that no, I do not want implicit extraction/copy of the
I agree it's controversial.
> However, npz files are not necessarily compressed, and I expect that
> memory-mapping is possible for the non-compressed variant.
> If so, mmap_mode would ideally work for uncompressed npz files, and
> compressed files could raise an error suggesting that the user
> uncompress the file manually when mmap_mode is given.
I got the same idea this afternoon. I will test that soon.
Thanks for your constructive answer!
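A rough sketch of how such a test might look follows. It is an experiment under stated assumptions, not how NumPy implements anything: memmap_npz_member is a hypothetical helper, the member is assumed to be stored (not deflated), unencrypted, and written in npy format version 1.0 or 2.0. It locates the raw array bytes inside the archive by reading the ZIP local file header and the embedded npy header, then maps that region with numpy.memmap. Compressed members raise an error, as Sebastian suggests.

```python
import os
import struct
import tempfile
import zipfile

import numpy as np
from numpy.lib import format as npy_format


def memmap_npz_member(npz_path, name, mode="r"):
    """Hypothetical helper: memory-map array `name` stored uncompressed in an npz."""
    with zipfile.ZipFile(npz_path) as zf:
        info = zf.getinfo(name + ".npy")
    if info.compress_type != zipfile.ZIP_STORED:
        raise ValueError(
            f"{name!r} is compressed; uncompress the archive manually "
            "before using mmap_mode"
        )
    with open(npz_path, "rb") as fp:
        # The local file header has 30 fixed bytes, followed by the file name
        # and an extra field; their lengths are the last two 16-bit fields.
        fp.seek(info.header_offset)
        local_header = fp.read(30)
        name_len, extra_len = struct.unpack("<HH", local_header[26:30])
        fp.seek(info.header_offset + 30 + name_len + extra_len)
        # Parse the embedded npy header to learn shape, memory order and dtype.
        version = npy_format.read_magic(fp)
        if version == (1, 0):
            shape, fortran_order, dtype = npy_format.read_array_header_1_0(fp)
        elif version == (2, 0):
            shape, fortran_order, dtype = npy_format.read_array_header_2_0(fp)
        else:
            raise ValueError(f"unsupported npy format version {version}")
        data_offset = fp.tell()  # the array bytes start right after the header
    if dtype.hasobject:
        raise ValueError("object arrays cannot be memory-mapped")
    return np.memmap(npz_path, dtype=dtype, mode=mode, offset=data_offset,
                     shape=shape, order="F" if fortran_order else "C")


with tempfile.TemporaryDirectory() as tmp:
    X = np.arange(20, dtype=np.int32).reshape(4, 5)
    stored = os.path.join(tmp, "stored.npz")
    packed = os.path.join(tmp, "packed.npz")
    np.savez(stored, X=X)             # members are stored uncompressed
    np.savez_compressed(packed, X=X)  # members are deflated

    mapped = memmap_npz_member(stored, "X")
    mapped_ok = isinstance(mapped, np.memmap) and np.array_equal(mapped, X)
    del mapped  # release the mapping before the directory is removed

    try:
        memmap_npz_member(packed, "X")
        compressed_rejected = False
    except ValueError:
        compressed_rejected = True
```

The local header is read from the file itself because its extra field can differ in length from the one recorded in the central directory, which is what zipfile exposes.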
> - Sebastian
>> Can somebody confirm that?
>> If I'm correct, the mmap_mode argument could be passed to the NpzFile
>> class, which could in turn perform the correct operation. One way to
>> handle that would be to use the ZipFile.extract method to write the
>> npy file to disk and then load it with numpy.load with the mmap_mode
>> argument. Note that the user will have to remove the file to reclaim
>> disk space (I guess that's OK).
>> One problem that could arise is that the extracted npy file can be
>> large (that is the point of using memory mapping), so it may be
>> useful to offer some control over where this file is extracted (for
>> instance, /tmp can be too small to hold the file). numpy.load
>> could offer a new option for that (passed to ZipFile.extract).
>> Does it make sense?
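The extraction approach proposed above can be tried by hand today with the standard zipfile module. The sketch below is an illustration only: extract_and_mmap and its extract_dir parameter are hypothetical names, not an existing numpy.load option. It extracts one member into a directory chosen by the caller, memory-maps the resulting npy file, and leaves it to the caller to delete that file afterwards.

```python
import os
import tempfile
import zipfile

import numpy as np


def extract_and_mmap(npz_path, name, extract_dir, mmap_mode="r"):
    """Hypothetical helper: extract `name` from an npz archive and memory-map it."""
    with zipfile.ZipFile(npz_path) as zf:
        # ZipFile.extract returns the path of the file it wrote.
        npy_path = zf.extract(name + ".npy", path=extract_dir)
    return np.load(npy_path, mmap_mode=mmap_mode), npy_path


with tempfile.TemporaryDirectory() as tmp:
    X = np.linspace(0.0, 1.0, 1000).reshape(100, 10)
    npz_path = os.path.join(tmp, "data.npz")
    np.savez_compressed(npz_path, X=X)  # extraction also works when compressed

    # The caller picks the extraction directory, e.g. a disk with enough space.
    scratch = os.path.join(tmp, "scratch")
    os.makedirs(scratch)

    arr, npy_path = extract_and_mmap(npz_path, "X", scratch)
    is_memmap = isinstance(arr, np.memmap)
    same_data = np.array_equal(arr, X)

    # Reclaiming disk space is the caller's job: drop the mapping, then delete.
    del arr
    os.remove(npy_path)
    reclaimed = not os.path.exists(npy_path)
```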
>> Thanks in advance,
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org