[Python-Dev] PEP: Extending the buffer protocol to share array information.

Wed Nov 1 01:13:37 CET 2006

Martin v. Löwis wrote:
> Travis E. Oliphant schrieb:
> 
>>    Several extensions to Python utilize the buffer protocol to share
>>    the location of a data-buffer that is really an N-dimensional
>>    array.  However, there is no standard way to exchange the
>>    additional N-dimensional array information so that the data-buffer
>>    is interpreted correctly.  The NumPy project introduced an array
>>    interface (http://numpy.scipy.org/array_interface.shtml) through a
>>    set of attributes on the object itself.  While this approach
>>    works, it requires attribute lookups which can be expensive when
>>    sharing many small arrays.  
> 
> 
> Can you please give examples for real-world applications of this
> interface, preferably examples involving multiple
> independently-developed libraries?
> ("this" being the current interface in NumPy - I understand that
>  the PEP's interface isn't implemented, yet)
> 

Examples of Need

     1) Suppose you have an image in *.jpg format that came from a
     camera and you want to apply Fourier-based image recovery to try
     and de-blur the image using modified Wiener filtering.  Then you
     want to save the result in *.png format.  The PIL provides an easy
     way to read *.jpg files into Python and write the result to *.png,
     and NumPy provides the FFT and the array math needed to implement
     the algorithm.  Rather than have to dig into the details of how
     NumPy and the PIL interpret chunks of memory in order to write a
     "converter" between NumPy arrays and PIL arrays, there should be
     support in the buffer protocol so that one could write
     something like:

     # Read the image
     a = numpy.frombuffer(Image.open('myimage.jpg'))

     # Process the image.
     A = numpy.fft.fft2(a)
     B = A*inv_filter
     b = numpy.fft.ifft2(B).real

     # Write it out
     Image.frombuffer(b).save('filtered.png')

     Currently, without this proposal, you have to worry about the "mode"
     the image is in and get its shape using a specific method call
     (this method call is different for every object you might want to
     interface with).

     2) The same argument applies to a library that reads and writes
     audio or video formats.

     3) You want to blit images onto a GUI Image buffer for rapid
     updates but need to do math processing on the image values
     themselves or you want to read the images from files supported by
     the PIL.

     If the PIL supported the extended buffer protocol, then you would
     not need to worry about the "mode" and the "shape" of the Image.

     What's more, you would also be able to accept images from any
     object (like NumPy arrays or ctypes arrays) that supported the
     extended buffer protocol without having to learn how it shares
     information like shape and data-format.
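
To make "shape and data-format travelling with the buffer" a bit more concrete, here is a minimal sketch. It is not the PEP's interface. It assumes a Python whose built-in memoryview exposes `shape`, `format`, and `strides`, and the `describe` helper is a hypothetical name used only for illustration.

```python
import array

# A flat buffer of 6 unsigned bytes, standing in for a 2x3 grayscale image.
pixels = array.array('B', [0, 10, 20, 30, 40, 50])

# Attach the N-dimensional information to the raw buffer without copying it.
view = memoryview(pixels).cast('B', shape=[2, 3])


def describe(exporter):
    """Hypothetical consumer: read the layout from the buffer itself,
    with no object-specific method calls or attribute conventions."""
    m = memoryview(exporter)
    return m.shape, m.format, m.strides, m.ndim


layout = describe(view)
```

The consumer never needs to know which library produced `view`; everything required to interpret the memory comes along with the buffer.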

I could have also included examples from PyGame, OpenGL, etc.  I thought 
people were more aware of this argument as we've made it several times 
over the years.  It's just taken this long to get to a point to start 
asking for something to get into Python.

> Paul Moore (IIRC) gave the example of equalising the green values
> and maximizing the red values in a PIL image by passing it to NumPy:
> Is that a realistic (even though not-yet real-world) example? 

I think so, but I've never done something like that.

> If
> so, what algorithms of NumPy would I use to perform this image
> manipulation (and why would I use NumPy for it if I could just
> write a for loop that does that in pure Python, given PIL's
> getpixel/setdata)?

Basically you would use array math operations and reductions (ufuncs and 
their methods, which are included in NumPy).  You would do it this way for 
speed.  It's going to be a lot slower doing those loops in Python. 
NumPy provides the ability to do them at close-to-C speeds.
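
As a rough sketch of the difference, here is one possible reading of that example. The interpretation is an assumption on my part: "maximizing red" is taken to mean setting the red channel to 255, and "equalising green" to mean setting every green value to the channel's integer mean. The function names are hypothetical, and the image is a synthetic (rows, cols, RGB) array rather than one loaded from the PIL.

```python
import numpy as np


def adjust_with_loops(img):
    """Pure-Python version: visit every pixel one at a time."""
    rows, cols, _ = img.shape
    total = 0
    for r in range(rows):
        for c in range(cols):
            total += int(img[r, c, 1])
    green_mean = total // (rows * cols)
    out = img.copy()
    for r in range(rows):
        for c in range(cols):
            out[r, c, 0] = 255         # maximize red
            out[r, c, 1] = green_mean  # flatten green to its mean
    return out


def adjust_with_numpy(img):
    """Same result using a reduction and whole-array assignment."""
    green = img[..., 1]
    green_mean = int(green.sum()) // green.size
    out = img.copy()
    out[..., 0] = 255
    out[..., 1] = green_mean
    return out


# Synthetic 8x8 RGB image with deterministic values.
img = (np.arange(8 * 8 * 3) % 256).astype(np.uint8).reshape(8, 8, 3)
loop_result = adjust_with_loops(img)
numpy_result = adjust_with_numpy(img)
```

Both functions produce the same array; the NumPy version simply moves the per-pixel loops out of the interpreter and into compiled code.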

-Travis