[Python-Dev] buffer object

Guido van Rossum guido@python.org
Sun, 07 May 2000 17:29:43 -0400

[Finn Bock]

> Forgive me for rewinding this to the very beginning. But what is a
> buffer object useful for? I'm trying to think about buffer objects in
> terms of jpython, so my primary interest is the user experience of
> buffer objects.
> Please correct my misunderstandings.
> - There is no buffer protocol exposed to python objects (in the way
>   the sequence protocol __getitem__ & friends are exposed).
> - A buffer object typically gives access to the raw bytes which
>   underlie the backing object, regardless of the structure of the
>   bytes.
> - It is only intended for objects which have a natural byte storage to
>   implement the buffer interface.

All true.

> - Of the builtin objects, only string, unicode and array support the
>   buffer interface.

And the new mmap module.
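
As a rough illustration of "raw bytes behind a backing object", here is
a minimal sketch. It assumes a modern Python 3 interpreter, where the
buffer() built-in no longer exists, so memoryview is swapped in as a
stand-in; raw_bytes is a hypothetical helper name, not part of any
library.

```python
# Sketch only: memoryview stands in for the old buffer() built-in.
from array import array


def raw_bytes(obj):
    """Return a copy of the bytes backing obj (hypothetical helper)."""
    # memoryview() only accepts objects that implement the C-level
    # buffer interface; anything else raises TypeError.
    with memoryview(obj) as view:
        return view.tobytes()


print(raw_bytes(array('b', [0, 1, 2])))  # bytes behind a byte array
print(raw_bytes(b"abc"))                 # bytes behind a byte string

try:
    raw_bytes("abc")  # a Unicode string exposes no natural byte storage
except TypeError as exc:
    print("TypeError:", exc)
```

In this sketch the Unicode string is rejected, which is roughly the
behaviour proposed for jpython further down in this message.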

> - When slicing a buffer object, the result is always a string regardless
>   of the buffer object base.
>
> In jpython, only byte arrays like jarrays.array('b', [0,1,2]) can be
> said to have some natural byte storage. The jpython string type doesn't.
> It would take some awful bit shifting to present a jpython string as an
> array of bytes.

I don't recall why JPython has jarray instead of array -- how do they
differ?  I think it's a shame that similar functionality is embodied
in different APIs.

> Would it make any sense to have a buffer object which only accepts a
> byte array as base? So that jpython would say:
>
> >>> buffer("abc")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> TypeError: buffer object expected
>
> Would it make sense to tell python users that they cannot depend on the
> portability of using strings (both 8-bit and 16-bit) as buffer object
> base?

I think that the portability of many string properties is in danger
with the Unicode proposal.  Supporting this in the next version of
JPython will be a bit tricky.

> Because it is so difficult to look at java storage as a sequence of
> bytes, I think I'm all for keeping the buffer() builtin and buffer
> object as obscure and unknown as possible <wink>.

I basically agree, and in a private email to Greg Stein I've told him
this.  I think that the array module should be promoted to a built-in
function/type, and should be the recommended solution for data
storage.  The buffer API should remain a C-level API, and the buffer()
built-in should be labeled with "for experts only".
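
A minimal sketch of the array module used as plain data storage,
assuming a modern Python 3 interpreter (tobytes/frombytes are the
Python 3 names; older releases called them tostring/fromstring). The
variable names are illustrative only.

```python
from array import array

# Typed, compact storage: every element is a signed short integer.
samples = array('h', [1, -2, 300])
samples.append(7)

# Round-trip through the raw byte representation without buffer().
packed = samples.tobytes()
restored = array('h')
restored.frombytes(packed)

print(restored.tolist())
print(len(packed), "bytes for", len(samples), "items")
```

The byte-level view is still reachable when needed, but user code only
touches the array API.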

--Guido van Rossum (home page: http://www.python.org/~guido/)