[Python-3000] patch: bytes object PyBUF_LOCKDATA read-only and immutable support

Tue Sep 11 07:10:48 CEST 2007

Guido van Rossum wrote:
> I'd like to see Travis's response to this. It's setting a precedent
> regarding locking objects in read-only mode; I haven't found other
> examples of objects using LOCKDATA (the only mentions of it seem to be
> rejecting it :). I keep getting confused by the two separate lock
> counts (and I think in this version the comment is inconsistent with
> the code). So I'm hoping Travis has a particular way in mind of
> handling LOCKDATA that can be used as a template.
>
> Travis?
>   

The use case I had in mind comes about quite often in NumPy when you 
want to modify the data-area of an object which may have a 
non-contiguous chunk of memory, but the algorithm being used expects 
contiguous data.  Imagine, for example, that the exporting object is an 
image whose rows are stored in different segments.  

The consumer of the buffer interface, however, may be an extension 
module that does fast image-processing operations and requires 
contiguous data.  Because it wants to write the results back into the
memory area when it is done with the algorithm (which may be thread-safe
and may release the GIL), it asks the object to lock its data as
read-only so that other consumers cannot get writeable buffers
while it is processing.

When the algorithm is done, it alone can write to the memory area, and
when it then releases the buffer, the original object restores
itself to being writeable.  Of course, the exporting object must support
this kind of operation, and not all objects will.  I expect the NumPy
array object, PIL, and other media-centric objects to support it,
for example.

It would probably be useful if the bytes object supported it because 
then other objects could use it as the memory area.    To do it 
correctly, the object exporting the interface must only allow locking if 
no other writeable interfaces have been exported (which it must keep 
track of) and then on release must check to see if the buffer that is 
being released is the one that locked its data.
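
To make that bookkeeping concrete, here is a minimal pure-Python sketch.
It is only a model of the rule described above, not the bytes patch or
the C-level buffer interface: the names LockableExporter, ExportedBuffer,
get_buffer and release are made up for illustration, and I am assuming
that plain read-only requests are still allowed while the data is locked.

```python
class ExportedBuffer:
    """Hypothetical handle returned to a consumer."""

    def __init__(self, writable, locked):
        self.writable = writable
        self.locked = locked


class LockableExporter:
    """Toy exporter that tracks writable exports and a single lock holder."""

    def __init__(self, data):
        self.data = bytearray(data)
        self._writable_exports = 0   # outstanding writable buffers
        self._lock_holder = None     # the buffer that locked the data, if any

    def get_buffer(self, writable=False, lock=False):
        if lock:
            # Locking is only allowed if nobody else holds a writable
            # buffer and the data is not already locked.
            if self._writable_exports or self._lock_holder is not None:
                raise BufferError("cannot lock: data is exported writable or locked")
            buf = ExportedBuffer(writable=False, locked=True)
            self._lock_holder = buf
            return buf
        if writable:
            # While locked, other consumers cannot get writable buffers.
            if self._lock_holder is not None:
                raise BufferError("data is locked read-only")
            self._writable_exports += 1
        return ExportedBuffer(writable=writable, locked=False)

    def release(self, buf):
        # On release, check whether this buffer is the one that locked
        # the data; only then does the exporter become writable again.
        if buf is self._lock_holder:
            self._lock_holder = None
        elif buf.writable:
            self._writable_exports -= 1
```

The point of keeping both a count and a lock holder is that releasing an
ordinary read-only buffer must not unlock the data; only the release of
the locking buffer does.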

For a real-life example, NumPy has a flag called UPDATEIFCOPY that is a 
slightly different implementation of the concept.   When this flag is 
set during conversion to an array, then if a copy must be made to 
satisfy the requirements, the original array is set as read-only and 
this special flag is set on the array.  When the copy is deleted, its 
memory is automatically copied (and possibly cast, etc.) back into the
original array.  It is a nice abstraction of the concept of an output 
data area that was borrowed from Numarray and allows many things to be 
implemented very quickly in NumPy.
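
The copy-and-write-back idea can be sketched in pure Python as follows.
This is not how NumPy implements UPDATEIFCOPY: SegmentedImage and
contiguous_writeback are hypothetical names, and the write-back happens
when a context manager exits rather than when the copy is deleted.

```python
from contextlib import contextmanager


class SegmentedImage:
    """Hypothetical exporter whose rows live in separate memory segments."""

    def __init__(self, rows):
        self.rows = [bytearray(r) for r in rows]
        self.writeable = True


@contextmanager
def contiguous_writeback(image):
    """Yield a contiguous copy of the rows; copy it back on normal exit."""
    if not image.writeable:
        raise BufferError("image is already locked read-only")
    image.writeable = False               # original is read-only while the copy lives
    flat = bytearray().join(image.rows)   # gather the rows into one contiguous block
    total = len(flat)
    try:
        yield flat
        if len(flat) != total:
            raise ValueError("contiguous copy must keep its original size")
        # Scatter the result back, row by row, into the original segments.
        offset = 0
        for row in image.rows:
            row[:] = flat[offset:offset + len(row)]
            offset += len(row)
    finally:
        image.writeable = True            # restore writeability on release


img = SegmentedImage([b"\x01\x02", b"\x03\x04"])
with contiguous_writeback(img) as flat:
    # Stand-in for an algorithm that needs contiguous data.
    for i in range(len(flat)):
        flat[i] = 255 - flat[i]
```

If the processing step raises, this sketch skips the write-back and only
restores the writeable flag, so the original rows are left untouched.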

One of the main things people use the NumPy C-API for is to get a 
contiguous chunk of memory from an array in order to do processing in 
another language (such as C or Fortran).   It is nice to be able to 
specify that the result gets placed back into another chunk of memory 
(which may or may not be contiguous) in a unified fashion.   NumPy 
handles all the copying for you.  

My thinking was that many people will want to be able to get contiguous 
chunks of memory, do processing, and then copy the result back into a 
segment of memory from a buffer-exporting object which is passed into 
the routine as an output object.

I'm not sure if my explanations are helpful.  Please let me know if I 
can explain further. 

-Travis