[Python-Dev] Fw: Behavior of buffer()

Todd Miller jmiller@stsci.edu
Fri, 19 Jul 2002 07:29:37 -0400


This is a re-post in plain text of a message I sent yesterday in HTML.
Anyone not "consumed with interest" in the buffer object should
probably skip it.

Scott Gilbert wrote:

>--- Todd Miller <jmiller@stsci.edu> wrote:
>
>>>I don't understand what you say, but I believe you.
>>>
>>I meant we call  PyBuffer_FromReadWriteObject and the resulting buffer 
>>lives longer than the extension function call that created it.   I have 
>>heard that it is possible for the original object to "move" leaving the 
>>buffer object pointer to it dangling.
>>
>
>Yes.  The PyBufferObject grabs the pointer from the PyBufferProcs
>supporting object when the PyBufferObject is created.  If the PyBufferProcs
>supporting object reallocates the memory (possibly from a resize) the
>PyBufferObject can be left with a bad pointer.  This is easily possible if
>you try to use the array module arrays as a buffer.
>
Thanks for the example.  This is good to know.
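
For anyone reading this on a current Python 3 interpreter, the hazard
above can be sketched with memoryview.  That type postdates this thread,
so treat it as an assumption on my part and not as the buffer object
under discussion.  In CPython the array refuses to resize while a view
is outstanding, so the view is not left holding a stale pointer:

```python
from array import array

def resize_while_exported():
    """Try to grow an array while a view of its memory is alive."""
    data = array("b", [1, 2, 3, 4])
    view = memoryview(data)      # the view refers to the array's storage
    try:
        data.append(5)           # growing may reallocate that storage
    except BufferError:
        blocked = True           # the exporter refused to resize
    else:
        blocked = False
    view.release()               # drop the export
    data.append(5)               # resizing is allowed again
    return blocked, len(data)
```

The old buffer object has no such locking step, which is why the pointer
it grabbed at creation time can go bad.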

>
>
>I've submitted a patch to fix this particular problem (among others), but
>there are still enough things that the buffer object can't do that
>something new is needed.
>
I understand.  I saw your patches and they sounded good to me.

>
>>>
>>>>>Maybe instead of the buffer() function/type, there should be a way to
>>>>>allocate raw memory?
>>>>>
>>>>Yes.    It would also be nice to be able to:
>>>>
>>>>1.  Know (at the python level) that a type supports the buffer C-API.
>>>>
>>>Good idea.  (I guess right now you can see if calling buffer() with an
>>>instance as argument works. :-)
>>>
>>>>2.  Copy bytes from one buffer to another (writeable buffer).  
>>>>
>
>And the copy operations shouldn't create any large temporaries:
>
I agree with this completely.  I could summarize my opinion by saying
that while I regard the current buffering system as pretty complete,
the buffer object places emphasis on the wrong behavior.  In terms of
modelling memory regions, strings are the wrong way to go.

>
>
>  buf1 = memory(50000)
>  buf2 = memory(50000)
>  # no 10K temporary should be created in the next line
>  buf1[10000:20000] = buf2[30000:40000] 
>
>The current buffer object could be used like this, but it would create a
>temporary string.  
>
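
The copy Scott describes can be sketched as follows, assuming Python 3
where bytearray stands in for the proposed memory(n) allocator and
memoryview stands in for the view type.  Both are my substitutions, and
copy_region is a hypothetical helper name.  Slicing a memoryview gives
another view, so no intermediate string is built:

```python
def copy_region(dst, dst_start, src, src_start, count):
    """Copy count bytes from src to dst without an intermediate string."""
    dst_view = memoryview(dst)   # dst must be writable
    src_view = memoryview(src)
    # The right-hand slice is a view onto src, not a copy of its bytes.
    dst_view[dst_start:dst_start + count] = src_view[src_start:src_start + count]

buf1 = bytearray(50000)                            # zero-filled destination
buf2 = bytearray(i % 251 for i in range(50000))    # sample source data
copy_region(buf1, 10000, buf2, 30000, 10000)
```
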
Having looked at buffering most of this week, the fact that mmap slicing
also returns strings is one justification I've found for having a buffer
object,  i.e.,  mmap slicing is not a substitute for the buffer object.
The buffer object makes it possible to partition an mmap or any
bufferable object into pseudo-independent, possibly writable, pieces.

One justification for a new buffer object is pickling (one of
Scott's posts alerted me to this).   I think the behavior we want for
numarray is to be able to pickle a view of a bufferable object more or
less like a string containing the buffer image, and to unpickle it as a
memory object.   The prospect of adding pickling support makes me wonder
if separating the allocator and view aspects of the buffer object is a
good idea;  I thought it was, but now I wonder.

>
>So getting an efficient copy operation seems to require that slices just
>create new "views" to the same memory.
>
Other justifications for a new buffer object might be:

1. The ability to partition any bufferable object into regions which can
be passed around.  These regions would themselves be buffers.

2. The ability to efficiently pickle a view of any bufferable object.
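
Both points can be sketched together, assuming Python 3 with bytearray
as the memory object and memoryview as the region type.  The names
pickle_view and unpickle_memory are hypothetical, and this is only one
way the pickling could behave:

```python
import pickle

def pickle_view(view):
    """Pickle the bytes a view covers, much like pickling a string."""
    return pickle.dumps(bytes(view))   # a single copy: the buffer image

def unpickle_memory(payload):
    """Rebuild the image as a fresh, writable memory block."""
    return bytearray(pickle.loads(payload))

block = bytearray(b"headerPAYLOADtrailer")
whole = memoryview(block)
# Each region is itself a buffer onto the same underlying memory.
header, body, trailer = whole[:6], whole[6:13], whole[13:]
body[:] = b"payload"                   # writing a region updates block
restored = unpickle_memory(pickle_view(body))
```

Note that the unpickled object owns its own memory and no longer refers
to the original block, which is part of what makes me wonder about the
allocator/view split.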

>
>>>Maybe you would like to work on a requirements gathering for a memory
>>>object
>>>
>>Sure.  I'd be willing to poll comp.lang.python (python-list?) and 
>>collate the results of any discussion that ensues.  Is that what you had 
>>in mind?
>>
>
>
>In the PEP that I'm drafting, I've been calling the new object "bytes"
>(since it is just a simple array of bytes).  Now that you guys are
>referring to it as the "memory object", should I change the name?  Doesn't
>really matter, but it might avoid confusion to know we're all talking about
>the same thing.
>
Calling this a memory type sounds the best to me.  The question I have
not resolved for myself is whether there should be one type which
"does it all" or two types, a memory allocator and a bufferable
object manipulator.
