[issue10181] Problems with Py_buffer management in memoryobject.c (and elsewhere?)

Sun Feb 13 15:19:14 CET 2011

Antoine Pitrou <pitrou at free.fr> added the comment:

> The problem in the current way is that the structure sent to
> `bf_releasebuffer` does not contain the same data as what was filled
> in by `bf_getbuffer`, and since the contents are dup-ed,
> `bf_releasebuffer` is called multiple times with the same data.

Hmm, there's a misunderstanding. bf_releasebuffer is called exactly once
for each call to bf_getbuffer. Of course, bf_getbuffer can be called
several times!
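
A minimal sketch of that pairing, seen from the Python side. It assumes a
recent CPython 3, where a bytearray refuses to resize while any export is
outstanding; the can_resize() helper is hypothetical and only serves as a
probe. Each memoryview(data) call is one getbuffer call on the bytearray,
and each release() is the single matching release:

```python
def can_resize(ba):
    """Hypothetical probe: True if the bytearray currently accepts a resize."""
    try:
        ba.append(0)
    except BufferError:
        return False
    ba.pop()  # undo the probe so the contents stay unchanged
    return True

data = bytearray(b"abc")
first = memoryview(data)    # one getbuffer call on data
second = memoryview(data)   # a second, independent getbuffer call

states = [can_resize(data)]       # two exports outstanding
first.release()                   # the one release matching the first call
states.append(can_resize(data))   # one export still outstanding
second.release()                  # the one release matching the second call
states.append(can_resize(data))   # no exports left

print(states)
```

The bytearray only becomes resizable again once every getbuffer call has
seen its own release.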

> So, `bf_releasebuffer` cannot rely on (i) the data in Py_buffer being
> what `bf_getbuffer` put there,

Well, why should it rely on that?

> So, `bf_releasebuffer` cannot be used to release any resources
> allocated in `bf_getbuffer`.

AFAICT, it can. That's what the "internal" pointer is for:

        This is for use internally by the exporting object. For example,
        this might be re-cast as an integer by the exporter and used to
        store flags about whether or not the shape, strides, and
        suboffsets arrays must be freed when the buffer is released. The
        consumer should never alter this value.

(http://docs.python.org/dev/c-api/buffer.html#Py_buffer.internal)

> > Some worrying things here:
> > 
> > * memoryview_getbuffer() doesn't call the original object's getbuffer.
> >   This means that if I do:
> >         m = memoryview(some_object)
> >         n = memoryview(m)
> >         m.release()
> >   n ends up holding a buffer to some_object's memory, but some_object 
> >   doesn't know about it and can free the pointer at any time.
> 
> Good point. There are two possible solutions to this:
> 
> - Keep a count of how many buffers memoryview() has "exported", 
>   and do not allow memoryview.release() if some are active.

Where would that count be stored?

>   In a sense, this would be more in line with the PEP:
>   the PyMemoryViewObject would here act as an ordinary object
>   exporting some block of memory, and not do any magic tricks.

Well, that sounds wrong to me. The memoryview doesn't export anything;
the original object does.

>   It would guarantee that the buffers it has "exported" stay valid.

How would it, since it doesn't know the original object's semantics?

> - Add additional fields to `PyMemoryViewObject` for storing
>   new `strides`, `format`, and `shape` that override the stuff
>   in Py_buffer.
> 
>   This would allow for calling `PyObject_GetBuffer` for a second time.

Sounds better to me :)

> Calling PyObject_GetBuffer to get a new hold of a buffer needs some
> precautions, though. For example:
> 
>     >>> mem = memoryview(some_object)
>     # do some stuff
>     >>> mem2 = memoryview(some_object)
>     >>> assert mem.format == mem2.format  # not guaranteed

Well, if the original object decides to change the format between two
calls, then memoryview() should simply reflect that.

> > * same for _get_sub_buffer_index() and _get_sub_buffer_slice0().
> >  Actually, the whole concept of non-owner memoryviews seems flawed.
> 
> If the "parent" memoryview keeps its memory valid as long as such
> sub-memoryviews exist, such concerns go away.

So release() wouldn't do what it claims to do? That sounds misleading.
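
To make the scenario concrete, here is a small sketch of the nested and
sliced cases. It shows what a recent CPython 3 does, which is an assumption
about the interpreter you run and not necessarily what the code under review
in this thread does. The variable names are illustrative only:

```python
data = bytearray(b"abcdef")
parent = memoryview(data)
nested = memoryview(parent)   # memoryview of a memoryview
tail = parent[2:]             # sliced sub-memoryview

parent.release()              # release only the parent view

# The other two views are still usable after the parent is released.
still_readable = (nested.tobytes(), tail.tobytes())

# The exporter is still locked, because views of its memory remain.
try:
    data.append(0)
    locked = False
except BufferError:
    locked = True

nested.release()
tail.release()
data.append(0)                # no views left, so the resize goes through
```

In this sketch release() invalidates the view it is called on, while the
underlying memory stays pinned until the last view is gone.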

> > Some other things:
> >
> > * why do you accept the ellipsis when indexing? what is it supposed to 
> >   mean?
> 
> Same meaning as in Numpy. a[...] == a

Yes, but why would we do that in core Python? a[...] might have a
meaning in NumPy, but it has none in core Python currently.
I'm not against it, but it would warrant discussion on python-dev.
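
As a quick illustration of the current situation (the Echo class is a
hypothetical helper, not anything from the patch): the ellipsis is passed to
__getitem__ like any other key, and a built-in sequence such as list simply
rejects it.

```python
class Echo:
    """Hypothetical helper: returns whatever key it is indexed with."""
    def __getitem__(self, key):
        return key

key = Echo()[...]          # the key received is the Ellipsis singleton

try:
    [1, 2, 3][...]         # built-in list gives the ellipsis no meaning
    list_accepts = True
except TypeError:
    list_accepts = False
```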

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue10181>
_______________________________________