[issue10181] Problems with Py_buffer management in memoryobject.c (and elsewhere?)

Pauli Virtanen report at bugs.python.org
Sun Feb 13 14:20:55 CET 2011


Pauli Virtanen <pav at iki.fi> added the comment:

Hi,

Please focus on the constraints of the consumer not mucking with the content of `Py_buffer`, and calling `bf_releasebuffer` only with data obtained from `bf_getbuffer` and only one. If we agree on this, then how to exactly to do the implementation is just a detail.

The problem in the current way is that the structure sent to `bf_releasebuffer` does not contain the same data as what was filled in by `bf_getbuffer`, and since the contents are dup-ed, `bf_releasebuffer` is called multiple times with the same data. So, `bf_releasebuffer` cannot rely on (i) the data in Py_buffer being what `bf_getbuffer` put there, and (ii) getting the same Py_buffer data only once.

So, `bf_releasebuffer` cannot be used to release any resources allocated in `bf_getbuffer`. For example, Numpy's PEP 3118 implementation does not define a `bf_releasebuffer` although it needs to allocate resources in each call to `bf_getbuffer`. Instead, it has to keep a track of allocated memory, and deallocate all of them on `array_dealloc`. This is a PITA, and one that could be avoided by a small change in the PEP implementation and documentation in Python.

> Some worrying things here:
> 
> * memoryview_getbuffer() doesn't call the original object's getbuffer.
>   This means that if I do:
>         m = memoryview(some_object)
>         n = memoryview(m)
>         m.release()
>   n ends up holding a buffer to some_object's memory, but some_object 
>   doesn't know about it and can free the pointer at any time.

Good point. There are two possible solutions to this:

- Keep a count of how many buffers memoryview() has "exported", 
  and do not allow memoryview.release() if some are active.

  In a sense, this would be more in line with the PEP:
  the PyMemoryViewObject would here act as an ordinary object
  exporting some block of memory, and not do any magic tricks.
  It would guarantee that the buffers it has "exported" stay valid.

- Add additional fields to `PyMemoryViewObject` for storing
  new `strides`, `format`, and `shape` that override the stuff
  in Py_buffer.

  This would allow for calling `PyObject_GetBuffer` for a second time.

Both would be reasonably easy to do.

    ***

Calling PyObject_GetBuffer to get a new hold of a buffer needs some precautions, though. For example:

    >>> mem = memoryview(some_object)
    # do some stuff
    >>> mem2 = memoryview(some_object)
    >>> assert mem.format == mem2.format  # not guaranteed

so there is some extra work `memoryview_getbuffer` could do on top of calling PyObject_GetBuffer.

> * same for _get_sub_buffer_index() and _get_sub_buffer_slice0().
>  Actually, the whole concept of non-owner memoryviews seems flawed.

If the "parent" memoryview keeps its the memory valid as long as such sub-memoryviews exist, such concerns go away.

> * the fact that slicing increments the parent memoryview's refcount 
>  means that a loop like:
>     while len(m):
>       # do something
>       m = m[1:]
>  will end up keeping N chained memoryviews on the heap. Currently only
>  the last memoryview remains alive.

This has an easy solution: if the parent is a non-owner memoryview, take a reference to its obj. This then keeps only buffer-owning view on the stack.

> Some other things:
>
> * why do you accept the ellipsis when indexing? what is it supposed to 
>   mean?

Same meaning as in Numpy. a[...] == a

> * please don't use #warning. Just put the TODO in a /* ... */.

Sure.

> * please no "#if PY_VERSION_HEX >= 0x03000000". Just make your source
>   py3k-compatible and we'll take care of backporting it to 2.x if your
>  patch is accepted ;)

Those are just so that the backporting would be easy. It's not on the stage of a serious patch yet :)

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue10181>
_______________________________________


More information about the Python-bugs-list mailing list