[Numpy-discussion] np.array, copy=False and memmap

Thomas Jollans tjol at tjol.eu
Wed Sep 6 06:16:10 EDT 2017


On 2017-08-07 23:01, Nisoli Isaia wrote:
> Dear all,
> I have a question about the behaviour of
>
>     y = np.array(x, copy=False, dtype='float32')
>
> when x is a memmap. If we check the _mmap attribute of y
>
>     print "mmap attribute", y._mmap
>
> numpy tells us that y is not a memmap.

Regardless of any bugs exposed by the snippet of code below, everything
is fine here. You created y as an array, so it's an array, not a memmap.
Maybe it should be a memmap. It doesn't matter: it's still backed by a
memmap!


Python 2.7.5 (default, Aug  2 2017, 11:05:32)
IPython 5.4.1 -- An enhanced Interactive Python.

In [1]: import numpy as np

In [2]: np.__version__
Out[2]: '1.13.0'

In [3]: with open('test_memmap', 'w+b') as fp:
   ...:     fp.write(b'\0' * 2048)
   ...:

In [4]: x = np.memmap('test_memmap', dtype='int16')

In [5]: x
Out[5]: memmap([0, 0, 0, ..., 0, 0, 0], dtype=int16)

In [6]: id(x)
Out[6]: 47365848

In [7]: y = np.array(x, copy=False)

In [8]: y
Out[8]: array([0, 0, 0, ..., 0, 0, 0], dtype=int16)

In [9]: del x

In [10]: y.base
Out[10]: memmap([0, 0, 0, ..., 0, 0, 0], dtype=int16)

In [11]: id(y.base) == Out[6]
Out[11]: True

In [12]: y[:] = 0x0102

In [13]: y
Out[13]: array([258, 258, 258, ..., 258, 258, 258], dtype=int16)

In [14]: del y

In [15]: with open('test_memmap', 'rb') as fp:
    ...:     print [ord(c) for c in fp.read(10)]
    ...:
[2, 1, 2, 1, 2, 1, 2, 1, 2, 1]

In [16]:
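
For anyone who wants to repeat the check outside IPython, here is a rough
script version of the session above. It is only a sketch: it assumes
Python 3 and a NumPy where np.array(x, copy=False) can return a view of x
without copying, the function name check_backing is made up for
illustration, and the dtype is pinned to little-endian '<i2' so the bytes
on disk do not depend on the machine.

```python
import os
import tempfile

import numpy as np


def check_backing(path):
    """Hypothetical helper: repeat the memmap/array checks from the session."""
    # Create a 2048-byte file of zeros, as in the session.
    with open(path, "wb") as fp:
        fp.write(b"\0" * 2048)

    # Explicit little-endian int16 so the on-disk byte order is predictable.
    x = np.memmap(path, dtype="<i2", mode="r+")
    y = np.array(x, copy=False)

    facts = {
        "y_is_memmap": isinstance(y, np.memmap),
        "y_is_plain_ndarray": type(y) is np.ndarray,
        "base_is_x": y.base is x,
        "shares_memory": bool(np.shares_memory(x, y)),
    }

    # Writing through y writes into the mapped file.
    y[:] = 0x0102
    x.flush()
    del y, x  # drop every reference so the mapping can be released

    with open(path, "rb") as fp:
        facts["first_bytes"] = list(fp.read(4))
    return facts


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmpdir:
        for name, value in check_backing(os.path.join(tmpdir, "test_memmap")).items():
            print(name, value)
```

The point is the same as in the session: y is a plain ndarray, but its
base is the memmap, so writes through y end up in the file.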


> But the following code snippet crashes the python interpreter
>
>     # opens the memmap
>     with open(filename, 'r+b') as f:
>         mm = mmap.mmap(f.fileno(), 0)
>         x = np.frombuffer(mm, dtype='float32')
>
>     # builds an array from the memmap, with the option copy=False
>     y = np.array(x, copy=False, dtype='float32')
>     print "before", y
>
>     # closes the file
>     mm.close()
>     print "after", y
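
A crash at that point would be consistent with mm.close() releasing the
mapping while x and y still point into it. Below is a minimal sketch of
one way to avoid touching released memory, assuming a copy of the data is
acceptable at that point. It is written for Python 3 rather than the
Python 2 of the snippet above, and the helper name load_and_close is
hypothetical: take an owning copy, drop the views, and only then close
the mapping.

```python
import mmap
import os
import tempfile

import numpy as np


def load_and_close(path):
    """Hypothetical helper: copy data out of an mmap before closing it."""
    with open(path, "r+b") as f:
        mm = mmap.mmap(f.fileno(), 0)

    # x is a view onto the mapped memory; it owns no data of its own.
    x = np.frombuffer(mm, dtype="float32")

    # An explicit copy owns its memory and does not depend on the mapping.
    owned = np.array(x, copy=True)

    # Drop the view before closing, so nothing refers to the mapping.
    del x
    mm.close()
    return owned


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "data.bin")
        np.arange(4, dtype="float32").tofile(path)
        print(load_and_close(path))
```

This gives up the sharing that copy=False provides, so it only fits cases
where the data is needed after the mapping is gone.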
> 
> In my code I use memmaps to share read-only objects when doing parallel
> processing, and the behaviour of np.array, even if not consistent, is
> desirable. I share scipy sparse matrices over many processes, and if
> np.array made a copy when dealing with memmaps, this would force me to
> rewrite part of the sparse matrices code.
>
> Would it be possible in future releases of numpy to have np.array check,
> if copy is False, whether y is a memmap and in that case return a full
> memmap object instead of slicing it?
> 
> Best wishes
> Isaia
> 
> P.S. A longer account of the issue may be found on my university blog
> http://www.im.ufrj.br/nisoli/blog/?p=131
> 
> -- 
> Isaia Nisoli
> 


-- 
Thomas Jollans

