[Python-Dev] buffer objects

Thu, 2 May 2002 09:46:18 -0700 (PDT)

--- Guido van Rossum <guido@python.org> wrote:
> 
> If you feel that your patch isn't paid enough attention, please do! :-)
>

I just wanted to know how the process worked.  I read the FAQ, but I wasn't
sure if I was supposed to assign it to someone or just wait patiently. 
Here I am talking about it though, so clearly I'm not good at that patience
thing!  :-)

> 
> Are you referring to the buffer object or the buffer interface?
> 

The buffer object: the PyBufferObject* returned by the builtin buffer()
function.

>
> The buffer interface doesn't define an object type, it defines a
> particular way to look at an object (just like the numeric, sequence
> and mapping interfaces).  There's no point in prescribing a pickle
> format for it.
>

Agreed completely. 

>
> The buffer object was a mistake.
>

Bummer that you feel this way.  :-(

It looked to me like PyBufferProcs made a small mistake by including
bf_getsegcount() which is mostly a YAGNI in your terms.  I'm sure someone
is actually using the segcount, but they probably wouldn't have missed it
if it didn't exist.  A lot of code seems to assume the segcount is always
1.

I thought the buffer object was on the right track (with a tiny read/write
patch :-).

I guess I'm not getting the Python philosophy of things then.  We have
"string"s which are doubling as readonly-byte-arrays and text-strings, and
I thought the consensus around here was that this was an unfortunate
duality.  Then we have "unicode"s which are clearly just for text-strings. 
Both of these are immutable as per the philosophy that strings are a lot
like numbers.

Then we have buffer objects.  If the buffer object is a mistake, then there
is no endorsed way to get at a (possibly mutable) array of bytes from
Python.  One can use arrays of typecode 'B', but you can't point those at
your own memory, and they don't pickle.
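
To make "a (possibly mutable) array of bytes" concrete, here is a minimal
sketch.  It assumes present-day Python 3, where memoryview plays the role
of a writable view onto another object's memory; that type is not part of
the discussion above and is used only as a stand-in for illustration.

```python
import array

# An array of typecode 'B' holds unsigned bytes and owns its own memory.
data = array.array('B', [0, 0, 0, 0])

# A memoryview exposes that same memory without copying it
# (assumption: Python 3; this is a stand-in for a writable buffer object).
view = memoryview(data)

# Writing through the view changes the underlying array in place.
view[0] = 255
first_byte = data[0]
is_readonly = view.readonly
```

The view and the array share one block of memory, which is the property a
read/write buffer object is meant to provide.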

So if someone would only charge up the time machine, I thought it would be
preferable to only have unicode objects, and buffer objects.  (Possibly
with unicode objects being renamed as strings instead...)

I think you're saying that the only use for PyBufferProcs is from C/C++
extensions.  Without something like PyBufferObject, you can't manipulate
memory obtained through PyBufferProcs from pure Python.

>
> If you want an efficient way of reading/writing memory buffers, look
> at the support for the buffer interface of the file readinto and write
> methods.  You already can read and write arrays without copying.
> 

Yep, but you can't pickle them without copying to a (potentially large)
string first.  We want to use cPickle to pass objects between
processes/machines (across sockets, or through shared memory).  We know
that sometimes (frequently) those objects will have array data mixed in
with dictionaries/lists/....
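
A rough sketch of the contrast, assuming present-day Python 3 (where the
pickle module stands in for cPickle and array objects can be pickled,
unlike the situation described above).  The names payload, message and
wire are made up for the example.

```python
import array
import io
import pickle

# A 1 KiB byte array standing in for "array data".
payload = array.array('B', bytes(1024))

# readinto() fills the array's own memory; no intermediate string is built.
source = io.BytesIO(bytes(range(256)) * 4)
count = source.readinto(payload)

# Pickling a structure that contains the array writes the array's bytes
# into the pickle stream, so the data is copied on the way out.
message = {"name": "sample", "data": payload}
wire = pickle.dumps(message)
restored = pickle.loads(wire)
```

Reading is copy-free, but the pickled form is at least as large as the
array itself, which is the cost being objected to here.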

If arrays/bytes/buffers don't pickle nicely, then do we need to use two
channels?  One for picklable objects, and one for arrays/bytes?  The
algorithm for separating the picklable parts from the non-picklable parts
of a nested data structure would have to be nearly as complicated as pickle
itself.
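
To show what the two-channel approach would involve, here is a deliberately
incomplete sketch in present-day Python 3.  The function split_arrays and
the "__array__" placeholder are hypothetical, invented for this example.

```python
import array


def split_arrays(obj, side_channel):
    """Return a copy of obj with each array replaced by a placeholder.

    The arrays themselves are appended to side_channel so they could be
    sent separately.  Hypothetical helper; handles only dicts and lists.
    """
    if isinstance(obj, array.array):
        side_channel.append(obj)
        return ("__array__", len(side_channel) - 1)
    if isinstance(obj, dict):
        return {key: split_arrays(value, side_channel)
                for key, value in obj.items()}
    if isinstance(obj, list):
        return [split_arrays(item, side_channel) for item in obj]
    return obj


arrays = []
nested = {"a": [1, array.array('B', [1, 2])], "b": "x"}
skeleton = split_arrays(nested, arrays)
```

Even this toy version ignores tuples, class instances, shared references
and cycles, all of which pickle already handles, so a complete version
would end up duplicating much of pickle's traversal logic.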

Cheers,
    -Scott
