[Numpy-discussion] Views of memmaps and offset

Charles R Harris charlesr.harris at gmail.com
Sat Sep 22 14:19:53 EDT 2012


On Sat, Sep 22, 2012 at 12:06 PM, Olivier Grisel
<olivier.grisel at ensta.org> wrote:

> 2012/9/22 Charles R Harris <charlesr.harris at gmail.com>:
> >
> >
> > On Sat, Sep 22, 2012 at 11:52 AM, Charles R Harris
> > <charlesr.harris at gmail.com> wrote:
> >>
> >>
> >>
> >> On Sat, Sep 22, 2012 at 11:31 AM, Gael Varoquaux
> >> <gael.varoquaux at normalesup.org> wrote:
> >>>
> >>> On Sat, Sep 22, 2012 at 11:16:27AM -0600, Charles R Harris wrote:
> >>> >    I think this is a bug, taking a view should probably update the
> >>> > offset.
> >>>
> >>> OK, we can include a fix for that along with the patch to keep track
> >>> of the filename.
> >>
> >>
> >> It already tracks the file name
> >>
> >> In [1]: a = np.memmap('tmp.mmap', dtype=np.float64, shape=50, mode='w+', offset=4)
> >>
> >> In [2]: b = a[10:]
> >>
> >> In [3]: b.filename
> >> Out[3]: '/home/charris/tmp.mmap'
> >>
> >> or did you mean something else? I was guessing the fix could be made in
> >> the same place that copied over the filename.
> >>
> >
> > You can also tell it is a memmap
> >
> > In [4]: b._mmap
> > Out[4]: <mmap.mmap at 0x2312570>
>
> The problem is with:
>
> >>> c = np.asarray(b)
> >>> c.base
> <mmap.mmap at 0x2312570>
>
> But you lose the pointer to the filename and the offset. In previous
> versions of numpy, c.base used to be the np.memmap instance of which
> c is a view. That allowed efficient pickling without any memory copy
> when doing single-machine multiprocessing, by introspecting the base
> ancestry.
>
> This is no longer possible with the current base collapsing that is
> happening in numpy master. The only way would be to replace the
> mmap.mmap instance of a numpy.memmap object by a buffer implementation
> that would wrap or derive from mmap.mmap but also preserve the
> original filename and offset.
>
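
For reference, the offset that a view would ideally report can be worked out by hand from the two data pointers. This is only a sketch: byte_address is a hypothetical helper, the file is created in a temporary directory, and it assumes a.offset holds the value passed to the constructor.

```python
import os
import tempfile

import numpy as np

# Same setup as the session quoted above, but in a throwaway directory.
path = os.path.join(tempfile.mkdtemp(), "tmp.mmap")
a = np.memmap(path, dtype=np.float64, shape=50, mode="w+", offset=4)
b = a[10:]


def byte_address(arr):
    """Hypothetical helper: address of the first element of arr."""
    return arr.__array_interface__["data"][0]


# File offset of the view = offset of the parent + distance between the
# first element of the view and the first element of the parent.
view_offset = a.offset + (byte_address(b) - byte_address(a))
```

With float64 items, skipping 10 elements moves the start by 80 bytes, so the view begins 84 bytes into the file.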

Pickling was left as an unresolved problem after the offset updates to
memmap. It would be nice to get all those issues fixed up.
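
A minimal sketch of the kind of wrapper Olivier describes, using only the standard library. TrackedMmap, payload and the attribute names are hypothetical and not part of numpy; the sketch assumes the file already exists and is at least offset + length bytes long.

```python
import mmap
import os


class TrackedMmap(mmap.mmap):
    """Hypothetical mmap.mmap subclass that remembers its file name and offset."""

    def __new__(cls, filename, length, offset=0):
        # mmap offsets must be multiples of ALLOCATIONGRANULARITY, so map from
        # the preceding boundary and remember how many leading bytes to skip.
        skip = offset % mmap.ALLOCATIONGRANULARITY
        with open(filename, "r+b") as f:
            self = super().__new__(
                cls, f.fileno(), length + skip,
                access=mmap.ACCESS_WRITE, offset=offset - skip,
            )
        self.filename = os.path.abspath(filename)
        self.offset = offset
        self.length = length
        self._skip = skip
        return self

    def payload(self):
        """Return a copy of the bytes in [offset, offset + length)."""
        return self[self._skip:self._skip + self.length]

    def __reduce__(self):
        # Serialize only the recipe for reopening the file, not the data.
        return (type(self), (self.filename, self.length, self.offset))
```

Because __reduce__ returns just the file name, length and offset, pickling such an object should ship a small tuple and leave the receiving process to reopen the mapping itself, which is the no-copy behaviour wanted for single-machine multiprocessing.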

As to the 1.7 release, I've been thinking we are violating the "release
early, release often" maxim. Bugs trickle in at a constant rate, and if we
wait to fix them all, we wait forever. So while it would be nice to have
this in 1.7.0, I think we should also plan on a 1.7.1 bug fix release a few
months after the 1.7.0 release.

Chuck