[Python-Dev] PyBuffer* vs. array.array()

Bill Bumgarner bbum@codefab.com
Mon, 6 Jan 2003 10:00:23 -0500


On Sunday, Jan 5, 2003, at 16:58 US/Eastern, Guido van Rossum wrote:
>>          singlePlane = array.array('B')
>>          singlePlane.fromlist([0 for x in  range(0, width*height*3)] )
>
> I'm not sure if you were joking, but why not write
>
>          singlePlane.fromlist([0] * (width*height*3))
>
> ???

Not joking; I just wasn't thinking, and I haven't really done large blob 
manipulation in Python before.  That answers another question, though 
-- if I were to build an image with four channels -- red, green, blue, 
alpha -- and wanted the alpha channel to be set to 255 throughout, then 
I would do...

     singlePlane.fromlist([0, 0, 0, 255] * (width * height))

... or ...

     array.array('B', [0, 0, 0, 255])  * width * height
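
A minimal sketch checking that the two forms above build the same 
array.  The width and height values are placeholders picked only for 
illustration:

```python
import array

# Hypothetical image dimensions, chosen only for illustration.
width, height = 4, 3

# Form 1: build a Python list of the repeated pixel, then load it.
via_list = array.array('B')
via_list.fromlist([0, 0, 0, 255] * (width * height))

# Form 2: repeat a one-pixel array directly, without building a large
# intermediate Python list.
via_repeat = array.array('B', [0, 0, 0, 255]) * width * height

print(via_list == via_repeat)
print(len(via_repeat))
```

The second form should be the cheaper of the two for large images, 
since it never creates a list of width*height*4 Python integers.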

>> ...........
>> --
>
> I'm not sure I understand the problem.

I was hoping that there was a single object type that could easily be 
used from both the C and Python side that could contain a large buffer 
of binary/byte data.

What I really need is a fixed-length buffer that supports slice-style 
assignment and retrieval.  The type of the elements is largely irrelevant, 
except that each element needs to be accessible as a single byte.

The fixed length requirement comes from the need to encapsulate buffers 
of memory as returned by various APIs outside of Python.   In this 
case, I'm providing access to hunks of memory controlled by the APIs 
provided by the Foundation and the AppKit within Cocoa (or GNUstep).

I also need to allocate a hunk of memory-- an array of bytes, a string, 
a buffer, whatever-- and pass it off through the AppKit/Foundation 
APIs.   Once those APIs have the address and length of the buffer, that 
address and length must remain constant over time.   I would really 
like to be able to do the allocation from the Python side of the 
fence-- allocate, initialize with a particular byte pattern, and pass 
it off to the Foundation/AppKit (while still being able to manipulate 
the contents in Python).
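
As a rough sketch of the allocate / initialize / manipulate sequence 
using array.array from the Python side: the buffer size and fill byte 
are made-up values, and the stable address is something observed in 
CPython when the array is never resized, not a documented guarantee.

```python
import array

size = 100  # hypothetical buffer size in bytes

# Allocate and initialize with a repeating byte pattern.
plane = array.array('B', [0xAB]) * size

# buffer_info() reports the current (address, length in elements).
address, length = plane.buffer_info()

# Same-length slice assignment; the right-hand side must itself be an
# array of the same type code, not a plain list.
plane[0:10] = array.array('B', range(10))

new_address, new_length = plane.buffer_info()
print(new_address == address, new_length == length)
```

Anything that changes the length (append(), extend(), a slice 
assignment of a different size) may move the storage, which is why an 
array alone does not satisfy the fixed address and length requirement.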

The PyBuffer* C API seems to be ideal in that a buffer object produced 
via the PyBuffer_New() function is read/write (unlike a buffer produced 
by buffer() in Python), contains a reference to a fixed length array at 
a fixed address, and is truly a bag o' bytes.

At this point, I'll probably add some kind of an 'allocate' function to 
the 'objc' module that simply calls PyBuffer_New().

Did that.  It works except that, of course, the resulting buffer is an 
array of chars, so slice assignments have to take strings.  
Inconvenient, but workable:

 >>> import objc
 >>> b = objc.allocateBuffer(100)
 >>> type(b)
<type 'buffer'>
 >>> b[0:10] = range(0,10)
Traceback (most recent call last):
   File "<stdin>", line 1, in ?
TypeError: bad argument type for built-in operation
 >>> b[0:10] = [chr(x) for x in range(0,10)]
Traceback (most recent call last):
   File "<stdin>", line 1, in ?
TypeError: bad argument type for built-in operation
 >>> b[0:10] = "".join([chr(x) for x in range(0,10)])
 >>> b
<read-write buffer ptr 0x1ad4bc, size 100 at 0x1ad4a0>
 >>> b[0:15]
'\x00\x01\x02\x03\x04\x05\x06\x07\x08\t\x00\x00\x00\x00\x00'

> You could use the 'c' code for creating an array instead of 'B'.

Right;  as long as it is a byte, it doesn't matter.  I chose 'B' 
because it is an unsigned numeric type.   Since I'm generating numeric 
data that is shoved into the bitmap as R,G,B triplets, a numeric type 
seemed to be the most convenient.

> Or you can use the tostring() method on the array to convert it to a
> string.
>
> Or you could use buffer() on the array.
> But why don't you just use strings for binary data, like everyone
> else?

Because strings are variable length, do not support slice style 
assignments, and require all numeric data to be converted to a string 
before being used as 'data'.
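
A small sketch of the slice-assignment difference, using throwaway 
three-byte values: the string rejects the assignment with a TypeError, 
while the equivalent array accepts it in place.

```python
import array

pixels = "\x00\x00\x00"
try:
    # Strings cannot be modified in place, so this raises TypeError.
    pixels[0:2] = "\x01\x02"
    string_slice_assignment_ok = True
except TypeError:
    string_slice_assignment_ok = False

# The same update on an unsigned-byte array works, with plain integers
# as the data and no conversion to a string first.
plane = array.array('B', [0, 0, 0])
plane[0:2] = array.array('B', [1, 2])

print(string_slice_assignment_ok)
print(plane.tolist())
```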

b.bum