[Numpy-discussion] Views of memmaps and offset

Gael Varoquaux gael.varoquaux at normalesup.org
Sat Sep 22 18:20:02 EDT 2012


On Sat, Sep 22, 2012 at 12:19:53PM -0600, Charles R Harris wrote:
>      But you lose the pointer to the filename and the offset. In previous
>      versions of numpy c.base used to be the np.memmap instance of which
>      c is an array view. That allowed efficient pickling without
>      any memory copy when doing single machine multiprocessing stuff by
>      introspecting the base ancestry.

>      This is no longer possible with the current base collapsing that is
>      happening in numpy master. The only way would be to replace the
>      mmap.mmap instance of a numpy.memmap object by a buffer implementation
>      that would wrap or derive from mmap.mmap but also preserve the
>      original filename and offset.

>    Pickling was left as an unresolved problem after the offset updates to
>    memmap.

To be clear, the issue here is not really a pickling issue, in the sense
of pickling as a general-purpose persistence model, but rather a specific
and focused I/O problem. We don't want to write out to disk and load back
an array that is already a view on a file on disk.

With the current numpy, we can tell that it is indeed a view on the disk,
and we can tell its offset and strides, but we cannot tell which file it
comes from, because that information is lost when deriving child
arrays.
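
To make the point concrete, here is a minimal sketch (not what numpy or any
library does internally) of what could be shipped between processes instead
of the data, assuming the filename is tracked separately. The helper names
describe_view and reopen_view are hypothetical, and the sketch only covers
views with non-negative strides.

```python
import os
import tempfile

import numpy as np


def data_address(arr):
    # Address of the first element, as exposed by the array interface.
    return arr.__array_interface__["data"][0]


def describe_view(view, root, filename):
    # `root` is the np.memmap opened on `filename`; `view` is derived from it.
    # The filename must be passed in: it cannot be recovered from `view`.
    return {
        "filename": filename,
        "offset": root.offset + data_address(view) - data_address(root),
        "dtype": view.dtype.str,
        "shape": view.shape,
        "strides": view.strides,
    }


def reopen_view(desc):
    # Map the file as raw bytes, then rebuild the strided view on top of it.
    raw = np.memmap(desc["filename"], dtype=np.uint8, mode="r")
    return np.ndarray(
        shape=desc["shape"],
        dtype=np.dtype(desc["dtype"]),
        buffer=raw,
        offset=desc["offset"],
        strides=desc["strides"],
    )


# Demo on a temporary file holding 100 float64 values.
path = os.path.join(tempfile.mkdtemp(), "data.bin")
np.arange(100, dtype=np.float64).tofile(path)

root = np.memmap(path, dtype=np.float64, mode="r", shape=(10, 10))
view = np.asarray(root)[2:8:2, 1:5]  # plain ndarray, no filename attached

desc = describe_view(view, root, path)
restored = reopen_view(desc)
```

The small dict in desc is cheap to pickle, and reopen_view maps the file
again rather than copying the data. The offset and strides come from the
view itself; only the filename has to be supplied from outside, which is
exactly the missing piece.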

Gael


