[Numpy-discussion] Views of memmaps and offset

Olivier Grisel olivier.grisel at ensta.org
Sat Sep 22 10:08:50 EDT 2012


2012/9/22 Gael Varoquaux <gael.varoquaux at normalesup.org>:
> Hi list,
>
> I am struggling with offsets on the view of a memmapped array. Consider
> the following:
>
> import numpy as np
>
> a = np.memmap('tmp.mmap', dtype=np.float64, shape=50, mode='w+')
> a[:] = np.arange(50)
> b = a[10:]
>
> Here, I have a.offset == 0 and b.offset == 0. In practice, the data in b
> is offset compared to the start of the file, given that it is a view
> computed with an offset.
>
> My goal is, given b, to find a way to open a new view on the file, e.g.
> in a different process. For this I need the offset.
>
> Any idea of how I can retrieve it? In the previous numpy versions, I
> could go from b to a using the 'base' attribute of a. This is no longer
> possible.
>
> Also, should the above behavior be considered as a bug?

Note: this question only applies to the current master of numpy. On
previously released versions of numpy it's possible to introspect
`b.base.strides`.

A similar question applies if a was itself opened with an offset:

orig = np.memmap('tmp.mmap', dtype=np.float64, shape=100, mode='w+')
# negative markers to detect under / overflows
orig[:] = np.arange(orig.shape[0]) * -1.0

a = np.memmap('tmp.mmap', dtype=np.float64, shape=50, mode='r+', offset=16)
a[:] = np.arange(50)
b = a[10:]

How can the same view as b be reopened on the buffer allocated by orig
with the current API in numpy master?
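
To make the target concrete, here is a minimal sketch of the offset
computation, assuming the parent memmap a is still reachable (recovering
a from b alone on master is precisely the open question) and that b is a
contiguous view. The helper names data_address and file_offset_of_view
are hypothetical, not part of numpy; the sketch compares the data
pointers exposed through __array_interface__ and adds the offset that a
was opened with. It writes to a temporary directory rather than the
current one.

```python
import os
import tempfile

import numpy as np


def data_address(arr):
    """Address of arr's first element in this process's memory."""
    return arr.__array_interface__['data'][0]


def file_offset_of_view(parent, view):
    """Byte offset in the file of view's first element.

    Hypothetical helper: `parent` must be the np.memmap that `view`
    was sliced from, and `view` is assumed to be contiguous.
    """
    return parent.offset + data_address(view) - data_address(parent)


path = os.path.join(tempfile.mkdtemp(), 'tmp.mmap')

orig = np.memmap(path, dtype=np.float64, shape=100, mode='w+')
# negative markers to detect under / overflows
orig[:] = np.arange(orig.shape[0]) * -1.0
orig.flush()

a = np.memmap(path, dtype=np.float64, shape=50, mode='r+', offset=16)
a[:] = np.arange(50)
a.flush()
b = a[10:]

# 16 bytes from opening a, plus 10 float64 items skipped by the slice
offset = file_offset_of_view(a, b)

# What another process would do with (path, dtype, shape, offset)
reopened = np.memmap(path, dtype=b.dtype, shape=b.shape, mode='r',
                     offset=offset)
```

Only the filename, dtype, shape and computed offset need to cross the
process boundary. A strided view such as a[10::2] would also need its
strides passed along, which this sketch does not handle.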

These questions stem from the following effort to build tools for
efficient memory management of numpy based datastructures when working
with python multiprocessing pools:
https://github.com/joblib/joblib/pull/44

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel


