[Python-Dev] The buffer interface

Greg Stein gstein@lyra.org
Mon, 16 Oct 2000 18:57:09 -0700

On Mon, Oct 16, 2000 at 01:22:22PM -0700, Jeff Collins wrote:
> I think that the buffer object is fairly important.  It provides a mechanism
> for exposing arbitrary chunks of memory (e.g., PyBuffer_FromMemory),
> something that no other python object does, AFAIK.  Perhaps clarifying the
> interface (such as the slice operator returning a buffer, as suggested
> below) and providing more hooks from Python for creating buffers (via
> newmodule, say) would be helpful.

There have been quite a few C extensions (and applications embedding Python!)
where the buffer objects have been used in this fashion. For example, if you
have a string argument that you wish to pass into Python, then you can avoid
a copy by wrapping a buffer object around it and passing that.

Many of the issues with the buffer object can be solved with simple changes.
For example, the "mutable object" thing is easily dealt with by having the
buffer object not record the pointer, but just fetch it from the underlying
object every time that it wants to do an operation.
[ and if we extend the buffer API, we could potentially optimize the
  behavior to avoid the ptr refetch on each operation ]
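
A minimal sketch of the refetch idea, written in present-day Python rather
than in the C of the buffer object itself. It assumes memoryview as a
stand-in for the C-level "get the pointer" call; the class name
RefetchingBuffer and its methods are hypothetical and exist only for
illustration.

```python
class RefetchingBuffer:
    """Hypothetical wrapper: remembers the base object, never its memory."""

    def __init__(self, base):
        self._base = base  # keep the object, not a pointer into it

    def __len__(self):
        # Acquire the memory, use it, and release it again right away.
        with memoryview(self._base) as view:
            return view.nbytes

    def tobytes(self):
        # Same pattern: the view lives only for this one operation.
        with memoryview(self._base) as view:
            return view.tobytes()


data = bytearray(b"spam")
buf = RefetchingBuffer(data)

first = buf.tobytes()
data.extend(b" and eggs")  # resizing is fine: nothing is held between calls
second = buf.tobytes()     # sees the new contents and the new length

# Contrast: a view that is held across operations pins the memory, and
# present-day Python refuses the resize with BufferError.
resize_blocked = False
held = memoryview(data)
try:
    data.extend(b"!")
except BufferError:
    resize_blocked = True
finally:
    held.release()
```

The cost is one acquisition per operation, which is the refetch that an
extended buffer API could potentially optimize away.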

I don't recall the motivation for returning strings. I believe it was based
on an attempt to make the buffer look as much like a string as possible (and
slices and concats return strings). That was a poor choice :-)  ... so,
again, some basic changes to return slices and concats as buffer objects
would make sense.
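
As a rough illustration of what "slices return buffer objects" means in
practice, the sketch below uses memoryview, the type that took over this role
in later Python versions, as an assumed stand-in for the buffer object. It
only covers slicing; the variable names are made up for the example.

```python
data = bytearray(b"hello world")
view = memoryview(data)

tail = view[6:]            # slicing yields another view, not a copied string
data[6] = ord("W")         # change the underlying memory in place

shared = bytes(tail)       # the slice reflects the change: it shares memory
copied = bytes(data)[6:]   # by contrast, this slice is an independent copy
data[6] = ord("w")         # a later change does not reach the copy
```

Here tail keeps referring to the original memory, while copied is frozen at
the moment the copy was made.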

Extending the buffer() builtin to create writeable buffer objects has been a
reasonably common request. What seems to happen instead is that people
developing C extensions (which desire buffer objects as their params) just
add a new function to the extension to create buffer objects.

Re: the buffer API: At the time the "s"/"t" codes were introduced (before
1.5.2 was released), we had a very different concept of how Unicode objects
would be implemented. At that time, Unicode objects had no 8-bit
representation (just 16-bit chars), so the difference was important. I'm not
clued in enough on the ramifications of torching the difference in the API,
but it would be a nice simplification.

Buffers vs arrays: this is a harder question. Which is the "recommended
binary type [for series of bytes]"? Buffers can refer to arbitrary memory.
Arrays maintain their own memory. I believe the two models are needed, so
I'd initially offer that both buffers and arrays need to be maintained.
However, given that... what is the purpose of the array if a buffer can
*also* maintain its own memory?


Greg Stein, http://www.lyra.org/