[Numpy-discussion] Views of memmaps and offset

Olivier Grisel olivier.grisel at ensta.org
Sat Sep 22 14:06:56 EDT 2012


2012/9/22 Charles R Harris <charlesr.harris at gmail.com>:
>
>
> On Sat, Sep 22, 2012 at 11:52 AM, Charles R Harris
> <charlesr.harris at gmail.com> wrote:
>>
>>
>>
>> On Sat, Sep 22, 2012 at 11:31 AM, Gael Varoquaux
>> <gael.varoquaux at normalesup.org> wrote:
>>>
>>> On Sat, Sep 22, 2012 at 11:16:27AM -0600, Charles R Harris wrote:
>>> >    I think this is a bug, taking a view should probably update the
>>> > offset.
>>>
>>> OK, we can include a fix for that alongside the patch to keep track
>>> of the filename.
>>
>>
>> It already tracks the file name
>>
>> In [1]: a = np.memmap('tmp.mmap', dtype=np.float64, shape=50, mode='w+',
>> offset=4)
>>
>> In [2]: b = a[10:]
>>
>> In [3]: b.filename
>> Out[3]: '/home/charris/tmp.mmap'
>>
>> or did you mean something else? I was guessing the fix could be made in
>> the same place that copies over the filename.
>>
>
> You can also tell it is a memmap
>
> In [4]: b._mmap
> Out[4]: <mmap.mmap at 0x2312570>

The problem is with:

>>> c = np.asarray(b)
>>> c.base
<mmap.mmap at 0x2312570>

But you lose the pointer to the filename and the offset. In previous
versions of numpy, c.base used to be the np.memmap instance of which
c is a view. That made it possible to pickle such arrays efficiently,
without any memory copy, when doing single-machine multiprocessing,
by introspecting the base ancestry.
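
A minimal sketch of that kind of introspection, assuming the .base
chain still reaches an np.memmap instance (which is exactly what the
base collapsing breaks). The helper names find_root_memmap and
describe_for_reopen are hypothetical, not part of numpy. The byte
offset is derived from the data pointers rather than from the view's
own .offset attribute, since views do not update that attribute:

```python
import numpy as np


def find_root_memmap(arr):
    """Return the topmost np.memmap in arr's .base chain, or None."""
    root = None
    obj = arr
    while obj is not None:
        if isinstance(obj, np.memmap):
            root = obj  # keep going: a sliced memmap is itself a memmap
        obj = getattr(obj, "base", None)
    return root


def describe_for_reopen(arr):
    """Return (filename, byte_offset, dtype, shape) or None.

    Only C-contiguous arrays are handled, so that the same data can be
    re-opened from the file with a plain offset and shape.
    """
    root = find_root_memmap(arr)
    if root is None or not arr.flags.c_contiguous:
        return None
    # Address of the first element of the view and of the root memmap.
    start = arr.__array_interface__["data"][0]
    root_start = root.__array_interface__["data"][0]
    # root.offset is the file offset of the root's first element.
    return root.filename, root.offset + (start - root_start), arr.dtype, arr.shape
```

The returned tuple is small enough to pickle, and another process on
the same machine could pass it to np.memmap(filename, dtype=dtype,
mode='r', offset=byte_offset, shape=shape) instead of receiving a copy
of the data.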

This is no longer possible with the base collapsing that currently
happens in numpy master. The only way around it would be to replace
the mmap.mmap instance held by a numpy.memmap object with a buffer
implementation that wraps or derives from mmap.mmap but also preserves
the original filename and offset.
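
A rough sketch of what such a buffer could look like, assuming a plain
Python subclass of mmap.mmap is acceptable. TrackedMmap, open_tracked,
find_tracked_mmap and the attribute names source_filename and
array_offset are all hypothetical; numpy.memmap does not do this
today. The array is built directly with np.ndarray on top of the
buffer only to keep the example short:

```python
import mmap
import os

import numpy as np


class TrackedMmap(mmap.mmap):
    """Hypothetical mmap.mmap subclass that remembers its origin."""

    source_filename = None  # absolute path of the mapped file
    array_offset = 0        # byte offset of the array data in the file


def open_tracked(filename, offset=0):
    """Map the whole (non-empty) file and record filename and offset."""
    with open(filename, "r+b") as f:
        mm = TrackedMmap(f.fileno(), 0)  # length 0 maps the entire file
    mm.source_filename = os.path.abspath(filename)
    mm.array_offset = offset
    return mm


def find_tracked_mmap(arr):
    """Walk the .base chain of arr until a TrackedMmap is found."""
    obj = arr
    while obj is not None:
        if isinstance(obj, TrackedMmap):
            return obj
        if isinstance(obj, memoryview):
            obj = obj.obj  # unwrap in case the buffer was wrapped
        else:
            obj = getattr(obj, "base", None)
    return None
```

With this layout the filename and offset live on the buffer object
itself, so they stay reachable even if every intermediate array in
the base chain is collapsed away.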

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel


