[Python-Dev] An updated extended buffer PEP

Tue Mar 27 21:39:16 CEST 2007

Carl Banks wrote:
> Travis Oliphant wrote:
>> Travis Oliphant wrote:
>>> Hi Carl and Greg,
>>>
>>> Here is my updated PEP which incorporates several parts of the 
>>> discussions we have been having.
>> And here is the actual link:
>>
>> http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/doc/pep_buffer.txt 
> 
> 
> What's the purpose of void** segments in PyObject_GetBuffer?  It seems 
> like it's leftover from an older incarnation?
> 

Yeah, I forgot to change that location.

> I'd hope after more recent discussion, we'll end up simplifying 
> releasebuffer.  It seems like it'd be a nightmare to keep track of what 
> you've released.

Yeah, I agree.   I think I'm leaning toward the bufferinfo structure 
which allows the exporter to copy memory for things that it wants to be 
free to change while the buffer is exported.

> 
> 
> Finally, the isptr thing.  It's just not sufficient.  Frankly, I'm 
> having doubts whether it's a good idea to support multibuffer at all. 
> Sure, it brings generality, but I'm thinking it's too hard to explain and 
> too hard to get one's head around, and will lead to lots of 
> misunderstanding and bugginess.  OTOH, it really doesn't burden anyone 
> except those who want to export multi-buffered arrays, and we only have 
> one shot to do it.  I just hope it doesn't confuse everyone so much that 
> no one bothers.

People used to have doubts about explaining strides in NumPy as well.  I 
sure would have hated to see them eliminate the possibility because of 
those doubts. I think the addition you discuss is not difficult once you 
get a hold of it.

I also understand now why subbufferoffsets is needed.  I was thinking 
that for slices you would just re-create a whole other array of pointers 
to contain that addition.   But, that is really not advisable.  It makes 
sense when you are talking about a single pointer variable (like in 
NumPy) but it doesn't when you have an array of pointers.

Providing the example about how to extract the pointer from the returned 
information goes a long way towards clearing up any remaining confusion.

Your ImageObject example is also helpful.   I really like the addition 
and think it is clear enough and supports a lot of use cases with very 
little effort.

> 
> Here's how I would update the isptr thing.  I've changed "derefoff" to 
> "subbufferoffsets" to describe it better.
> 
> 
> typedef PyObject *(*getbufferproc)(PyObject *obj, void **buf,
>                                     Py_ssize_t *len, int *writeable,
>                                     char **format, int *ndims,
>                                     Py_ssize_t **shape,
>                                     Py_ssize_t **strides,
>                                     Py_ssize_t **subbufferoffsets);
> 
> 
> subbufferoffsets
> 
>    Used to export information about multibuffer arrays.  It is an
>    address of a ``Py_ssize_t *`` variable that will be set to point at
>    an array of ``Py_ssize_t`` of length ``*ndims``.
> 
>    [I don't even want to try a verbal description.]
> 
>    To demonstrate how subbufferoffsets works, here is an example of a
>    function that returns a pointer to an element of ANY N-dimensional
>    array, single- or multi-buffered.
> 
>     void* get_item_pointer(int ndim, void* buf, Py_ssize_t* strides,
>                            Py_ssize_t* subbufferoffsets,
>                            Py_ssize_t* indices) {
>         char* pointer = (char*)buf;
>         int i;
>         for (i = 0; i < ndim; i++) {
>             /* step within the current buffer */
>             pointer += strides[i]*indices[i];
>             /* non-negative offset: dereference into the next buffer */
>             if (subbufferoffsets[i] >= 0) {
>                 pointer = *(char**)pointer + subbufferoffsets[i];
>             }
>         }
>         return (void*)pointer;
>     }
> 
>    For single buffers, subbufferoffsets is negative in every dimension
>    and it reduces to normal single-buffer indexing.  

What about just having subbufferoffsets be NULL in this case?  i.e. you 
don't need it.    If some of the dimensions did not need dereferencing 
then they would be negative (how about we just say -1 to be explicit)?
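
To make the NULL suggestion concrete, here is a minimal sketch of how the quoted function might look if a NULL offsets array meant "nothing is dereferenced" and -1 marked an individual dimension that is not dereferenced. It assumes nothing beyond what is said above: ssize_sketch_t is a stand-in for Py_ssize_t so the code compiles without Python.h, and contiguous_item is a hypothetical helper added only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for Py_ssize_t (an assumption, so Python.h is not required). */
typedef ptrdiff_t ssize_sketch_t;

static void *get_item_pointer(int ndim, void *buf,
                              const ssize_sketch_t *strides,
                              const ssize_sketch_t *subbufferoffsets,
                              const ssize_sketch_t *indices)
{
    char *pointer = (char *)buf;
    int i;
    assert(ndim >= 0);
    for (i = 0; i < ndim; i++) {
        pointer += strides[i] * indices[i];
        /* NULL array: single buffer, never dereference.
           -1 entry: this particular dimension is not dereferenced. */
        if (subbufferoffsets != NULL && subbufferoffsets[i] >= 0) {
            pointer = *(char **)pointer + subbufferoffsets[i];
        }
    }
    return (void *)pointer;
}

/* Hypothetical usage: element [1][2] of a contiguous 2x3 int array,
   exported with no offsets array at all. */
static int contiguous_item(void)
{
    static int a[2][3] = {{1, 2, 3}, {4, 5, 6}};
    ssize_sketch_t strides[2] = {3 * (ssize_sketch_t)sizeof(int),
                                 (ssize_sketch_t)sizeof(int)};
    ssize_sketch_t indices[2] = {1, 2};
    return *(int *)get_item_pointer(2, a, strides, NULL, indices);
}
```

With this convention a single-buffer exporter never has to allocate an offsets array, and a consumer that only handles single buffers can reject anything non-NULL up front.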

>    For multi-buffers,
>    subbufferoffsets indicates when to dereference the pointer and switch
>    to the new buffer, and gives the offset into the buffer to start at.
>    In most cases, the subbufferoffset would be zero (indicating it should
>    start at the beginning of the new buffer), but can be a positive
>    number if the following dimension has been sliced, and thus the 0th
>    entry in that dimension would not be at the beginning of the new
>    buffer.
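
As an illustration of the positive-offset case, the sketch below stores two rows in separate buffers reached through an array of row pointers, then describes the view that drops column 0 without rebuilding the pointer array. The data, the helper sliced_view_item, and the ssize_sketch_t stand-in for Py_ssize_t are assumptions made up for this example; only the indexing rule comes from the text above.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for Py_ssize_t (an assumption, so Python.h is not required). */
typedef ptrdiff_t ssize_sketch_t;

/* Same indexing rule as the quoted function. */
static void *get_item_pointer(int ndim, void *buf,
                              const ssize_sketch_t *strides,
                              const ssize_sketch_t *subbufferoffsets,
                              const ssize_sketch_t *indices)
{
    char *pointer = (char *)buf;
    int i;
    assert(ndim >= 0);
    for (i = 0; i < ndim; i++) {
        pointer += strides[i] * indices[i];
        if (subbufferoffsets[i] >= 0) {
            pointer = *(char **)pointer + subbufferoffsets[i];
        }
    }
    return (void *)pointer;
}

/* Two separately stored rows, reached through an array of row pointers. */
static int row0[4] = {10, 11, 12, 13};
static int row1[4] = {20, 21, 22, 23};
static int *rows[2] = {row0, row1};

/* Read element (r, c) of the view that keeps columns 1..3 of every row. */
static int sliced_view_item(ssize_sketch_t r, ssize_sketch_t c)
{
    /* Dimension 0 walks the pointer array; dimension 1 walks one row. */
    ssize_sketch_t strides[2] = {(ssize_sketch_t)sizeof(int *),
                                 (ssize_sketch_t)sizeof(int)};
    /* After dereferencing a row pointer, skip one int (the sliced-off
       column 0). Dimension 1 is not dereferenced, hence -1. */
    ssize_sketch_t subbufferoffsets[2] = {(ssize_sketch_t)sizeof(int), -1};
    ssize_sketch_t indices[2] = {r, c};
    return *(int *)get_item_pointer(2, rows, strides, subbufferoffsets,
                                    indices);
}
```

Here sliced_view_item(0, 0) reads row0[1] and sliced_view_item(1, 2) reads row1[3]; the row pointers themselves are never copied or adjusted, only the offset changes.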
> 
> 
> 
> Other than that, looks good. :)
> 

I think we are getting closer.   What do you think about Greg's idea of 
basically making the provider the bufferinfo structure and having the 
exporter handle copying memory over for shape and strides if it wants to 
be able to change those before the lock is released?
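
A rough sketch of what that could look like, purely for discussion: the struct layout and the names bufferinfo_sketch, bufferinfo_fill and bufferinfo_release are hypothetical (nothing in this thread fixes them), and ssize_sketch_t stands in for Py_ssize_t. The only point shown is that the exporter copies shape and strides into memory owned by the structure, so it can change its own copies while the buffer is still exported.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for Py_ssize_t (an assumption, so Python.h is not required). */
typedef ptrdiff_t ssize_sketch_t;

/* Hypothetical layout; the real field list is still under discussion. */
typedef struct {
    void *buf;
    int ndim;
    ssize_sketch_t *shape;    /* copy owned by this structure */
    ssize_sketch_t *strides;  /* copy owned by this structure */
} bufferinfo_sketch;

/* Exporter side: fill the structure with private copies of shape and
   strides. Returns 0 on success, -1 if allocation fails. */
static int bufferinfo_fill(bufferinfo_sketch *info, void *buf, int ndim,
                           const ssize_sketch_t *shape,
                           const ssize_sketch_t *strides)
{
    size_t nbytes;
    assert(ndim > 0);
    nbytes = (size_t)ndim * sizeof(ssize_sketch_t);
    info->shape = malloc(nbytes);
    info->strides = malloc(nbytes);
    if (info->shape == NULL || info->strides == NULL) {
        free(info->shape);
        free(info->strides);
        info->shape = info->strides = NULL;
        return -1;
    }
    memcpy(info->shape, shape, nbytes);
    memcpy(info->strides, strides, nbytes);
    info->buf = buf;
    info->ndim = ndim;
    return 0;
}

/* Called when the consumer is done: free the copies made above. */
static void bufferinfo_release(bufferinfo_sketch *info)
{
    free(info->shape);
    free(info->strides);
    info->shape = info->strides = NULL;
}
```

An exporter that never reshapes while exported could skip the copies and point the fields at its own arrays, which is why the copying is left to the exporter's discretion.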

-Travis