[Python-Dev] Extended Buffer Interface/Protocol
Travis Oliphant
oliphant.travis at ieee.org
Thu Mar 22 19:53:16 CET 2007
Greg Ewing wrote:
> Travis Oliphant wrote:
>
>
>>I'm talking about arrays of pointers to other arrays:
>>
>>i.e. if somebody defined in C
>>
>>float B[10][20]
>>
>>then B would be an array of pointers to arrays of floats.
>
>
> No, it wouldn't, it would be a contiguously stored
> 2-dimensional array of floats. An array of pointers
> would be
>
> float *B[10];
>
> followed by code to allocate 10 arrays of 20 floats
> each and initialise B to point to them.
>
You are right, of course; that example was not correct. I think the
point is still valid, though. One could still use the shape to
indicate how many levels of pointers-to-pointers there are (i.e. how
many pointer dereferences are needed to select out an element). Further
dimensionality could then be reported in the format string.
This would not be hard to allow. It also would not be hard to write a
utility function to copy such shared memory into a contiguous segment to
provide a C-API that allows casual users to avoid the details of memory
layout when they are writing an algorithm that just uses the memory.
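A rough sketch of such a copying utility, written in Python for brevity. This is only an illustration under stated assumptions: the name copy_to_contiguous is hypothetical, each level of Python list stands in for one level of C pointer dereference, and the leaves are contiguous blocks of floats.

```python
from array import array

def copy_to_contiguous(segments, depth):
    """Flatten `depth` levels of references into one contiguous float array.

    Each nesting level of list models one pointer dereference; the leaves
    are contiguous array('f') blocks.
    """
    out = array('f')

    def walk(node, level):
        if level == 0:
            out.extend(node)          # leaf: a contiguous block of floats
        else:
            for child in node:        # one "dereference" per level
                walk(child, level - 1)

    walk(segments, depth)
    return out

# Models `float *B[10]`, each entry pointing at its own block of 20 floats.
B = [array('f', [float(i * 20 + j) for j in range(20)]) for i in range(10)]
flat = copy_to_contiguous(B, 1)
```

An algorithm that only needs the values can then work on `flat` without knowing how the exporter laid out its memory.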
> I can imagine cases like that coming up in practice.
> For example, an image object might store its data
> as four blocks of memory for R, G, B and A planes,
> each of which is a contiguous 2d array with shape
> and stride -- but you want to view it as a 3d
> array byte[plane][x][y].
All we can do is have the interface actually be able to describe its
data. Users would have to take that information and write code
accordingly.
In this case, for example, one possibility is that the object would
raise an error if strides were requested. It would also raise an error
if contiguous data was requested (or I guess it could report the R
channel only if it wanted to). Only if segments were requested could
it return an array of pointers to the four memory blocks. It could then
report itself as a 2-d array of shape (4, H) where H is the height.
Each element of the array would be reported as "%sB" % W where W is the
width of the image (i.e. each element of the 2-d array would be a 1-d
array of length W).
Alternatively it could report itself as a 1-d array of shape (4,) with
elements "(H,W)B".
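To make the two descriptions concrete, here is a small sketch with made-up values for H and W. The "(H,W)B" element syntax is the proposed format-string extension, not something the struct module understands, so its size is computed by hand here.

```python
import struct

H, W = 3, 5  # hypothetical image height and width

# Option 1: a 2-d array of shape (4, H); each element is a row of W bytes.
shape_a = (4, H)
format_a = "%sB" % W                      # "5B"
total_a = shape_a[0] * shape_a[1] * struct.calcsize(format_a)

# Option 2: a 1-d array of shape (4,); each element is a whole (H, W) plane.
shape_b = (4,)
format_b = "(%s,%s)B" % (H, W)            # "(3,5)B", proposed syntax
total_b = shape_b[0] * (H * W)            # H*W one-byte items per element
```

Both descriptions account for the same 4*H*W bytes; they differ only in how much of the dimensionality is carried by the shape and how much by the format string.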
A user would have to write the algorithm correctly in order to access
the memory correctly.
Alternatively, a utility function that copies into a contiguous buffer
would allow the consumer to "not care" about exactly how the memory is
laid out. But the buffer interface would allow the utility function
to figure it out and do the right thing for each exporter. This
flexibility would not be available if we don't allow for segmented
memory in the buffer interface.
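For the image example, such a utility might look roughly like the following. This is a sketch only: gather_planes is a hypothetical name, the four bytearrays stand in for the separately allocated R, G, B and A blocks, and the memoryview cast assumes a recent Python 3.

```python
def gather_planes(planes):
    """Copy equally sized memory blocks into one contiguous bytearray."""
    size = len(planes[0])
    if any(len(p) != size for p in planes):
        raise ValueError("all segments must be the same size")
    out = bytearray(size * len(planes))
    for i, plane in enumerate(planes):
        out[i * size:(i + 1) * size] = plane   # place block i after block i-1
    return out

H, W = 2, 3  # hypothetical image height and width
# Four separate blocks, filled with 10, 20, 30 and 40 respectively.
planes = [bytearray([k] * (H * W)) for k in (10, 20, 30, 40)]

contiguous = gather_planes(planes)
# View the copy as a 3-d byte array of shape (plane, row, column).
view = memoryview(contiguous).cast("B", shape=(4, H, W))
```

The consumer indexes `view[plane, row, column]` and never sees that the exporter held four separate blocks.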
So, I don't think it's that hard to at least allow the multiple-segment
idea into the buffer interface (as long as all the segments are the same
size, mind you). It's only one more argument to the getbuffer call.
-Travis