[Numpy-discussion] Memory mapping and NPZ files

Mathieu Dubois mathieu.dubois at icm-institute.org
Thu Dec 10 14:07:16 EST 2015



On 10/12/2015 15:35, Sebastian Berg wrote:
> On Mi, 2015-12-09 at 15:51 +0100, Mathieu Dubois wrote:
>> Dear all,
>>
>> If I am correct, using mmap_mode with NPZ files has no effect, i.e.:
>> f = np.load("data.npz", mmap_mode="r")
>> X = f['X']
>> will load all the data in memory.
>>
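A quick way to check the claim above is the minimal sketch below. It assumes a throwaway file in a temporary directory; the array contents and the names `loaded` and `is_memmap` are made up for illustration. On the NumPy versions I know of, I would expect it to report a plain ndarray rather than a memmap.

```python
import os
import tempfile

import numpy as np

# Small illustrative array; any array would do.
X = np.arange(6, dtype=np.float64).reshape(2, 3)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.npz")
    np.savez(path, X=X)  # uncompressed archive

    # mmap_mode is accepted here, but the member is still read into memory.
    with np.load(path, mmap_mode="r") as f:
        loaded = f["X"]

    is_memmap = isinstance(loaded, np.memmap)

print(type(loaded).__name__, is_memmap)
```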
> My take on it is that, no, I do not want implicit extraction/copy of the
> file.
I agree it's controversial.
> However, npz files are not necessarily compressed, and I expect that
> memory-mapping is possible directly on an uncompressed archive.
> If that is possible, it would ideally work for uncompressed npz files
> and raise an error for compressed ones, suggesting that the user
> uncompress the file manually when mmap_mode is given.
I had the same idea this afternoon. I will test it soon.
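
A rough sketch of how mapping an uncompressed member could look, outside of NumPy itself. The helper name `mmap_npz_member` is hypothetical, and this is not how numpy.load behaves. It assumes the member was written with ZIP_STORED, that the .npy header is version 1.0 or 2.0, and that the array is non-empty and has no object dtype. It locates the raw .npy bytes inside the archive by parsing the zip local file header, then maps the data in place.

```python
import struct
import zipfile

import numpy as np
from numpy.lib import format as npy_format


def mmap_npz_member(npz_path, name, mode="r"):
    """Memory-map array `name` stored uncompressed in an .npz archive.

    Hypothetical helper, not part of NumPy.
    """
    with zipfile.ZipFile(npz_path) as zf:
        info = zf.getinfo(name + ".npy")
    if info.compress_type != zipfile.ZIP_STORED:
        raise ValueError(
            "member is compressed; uncompress the file manually "
            "before memory-mapping it"
        )

    with open(npz_path, "rb") as fh:
        # Zip local file header: 30 fixed bytes, followed by the file
        # name and the extra field; their lengths sit at bytes 26-29.
        fh.seek(info.header_offset)
        fixed = fh.read(30)
        name_len, extra_len = struct.unpack("<HH", fixed[26:30])
        fh.seek(info.header_offset + 30 + name_len + extra_len)

        # The file position is now at the start of the embedded .npy data.
        version = npy_format.read_magic(fh)
        if version == (1, 0):
            header = npy_format.read_array_header_1_0(fh)
        elif version == (2, 0):
            header = npy_format.read_array_header_2_0(fh)
        else:
            raise NotImplementedError("unsupported .npy version %r" % (version,))
        shape, fortran_order, dtype = header
        data_offset = fh.tell()  # first byte of the raw array data

    if dtype.hasobject:
        raise ValueError("object arrays cannot be memory-mapped")

    return np.memmap(
        npz_path,
        dtype=dtype,
        mode=mode,
        offset=data_offset,
        shape=shape,
        order="F" if fortran_order else "C",
    )
```

With a file written by np.savez, `mmap_npz_member("data.npz", "X")` would return a read-only memmap, while a file written by np.savez_compressed would hit the ValueError.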

Thanks for your constructive answer!
Mathieu

> - Sebastian
>
>> Can somebody confirm that?
>>
>> If I'm correct, the mmap_mode argument could be passed to the NpzFile
>> class which could in turn perform the correct operation. One way to
>> handle that would be to use the ZipFile.extract method to write the
>> Npy file on disk and then load it with numpy.load with the mmap_mode
>> argument. Note that the user will have to remove the file to reclaim
>> disk space (I guess that's OK).
>>
>> One problem that could arise is that the extracted Npy file can be
>> large (that is the point of using memory mapping), so it may be
>> useful to offer some control over where this file is extracted (for
>> instance, /tmp can be too small to hold the file). numpy.load
>> could offer a new option for that (passed to ZipFile.extract).
>>
>> Does it make sense?
>>
>> Thanks in advance,
>> Mathieu
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> https://mail.scipy.org/mailman/listinfo/numpy-discussion
