[Finn Bock]
Forgive me for rewinding this to the very beginning. But what is a buffer object useful for? I'm trying to think about buffer objects in terms of jpython, so my primary interest is the user experience of buffer objects.
Please correct my misunderstandings.
- There is no buffer protocol exposed to python objects (in the way the sequence protocol __getitem__ & friends are exposed).
- A buffer object typically gives access to the raw bytes which underlie the backing object, regardless of the structure of the bytes.
- Only objects which have a natural byte storage are intended to implement the buffer interface.
All true.
- Of the builtin objects, only string, unicode and array support the buffer interface.
And the new mmap module.
- When slicing a buffer object, the result is always a string, regardless of the buffer object's base.
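The points above can be tried out with a small sketch. It is written for present-day CPython 3, where the buffer() builtin no longer exists and memoryview is the closest equivalent, so it illustrates the idea rather than the exact API discussed here. The helper name raw_bytes is hypothetical. Note one difference from the behaviour described above: slicing a memoryview yields another memoryview, not a string.

```python
from array import array

# Objects with natural byte storage: a byte string and two typed arrays.
raw = b"abc"
signed_bytes = array("b", [0, 1, 2])
shorts = array("h", [1, 2])  # multi-byte items; the size is platform dependent


def raw_bytes(obj):
    """Hypothetical helper: copy out the bytes underlying obj, ignoring their structure."""
    # cast("B") reinterprets the storage as plain unsigned bytes.
    return memoryview(obj).cast("B").tobytes()


print(raw_bytes(raw))
print(raw_bytes(signed_bytes))
# The byte length is the item count times the item size.
print(len(raw_bytes(shorts)) == len(shorts) * shorts.itemsize)

# A text string has no natural byte storage and is rejected.
try:
    memoryview("abc")
except TypeError as exc:
    print("TypeError:", exc)
```

In this sketch the text string is refused with a TypeError, while the byte string and the arrays expose their storage directly.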
In jpython, only byte arrays like jarrays.array('b', [0,1,2]) can be said to have some natural byte storage. The jpython string type doesn't. It would take some awful bit shifting to present a jpython string as an array of bytes.
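To make the "bit shifting" concrete, here is a minimal sketch of what presenting a string of 16-bit characters as bytes involves. It is plain Python 3, not JPython code; the function name string_to_bytes and the high-byte-first order are assumptions chosen for illustration.

```python
def string_to_bytes(text):
    """Split each 16-bit code unit into two bytes, high byte first (an assumed order)."""
    out = bytearray()
    for ch in text:
        unit = ord(ch)
        if unit > 0xFFFF:
            raise ValueError("this sketch only handles characters that fit in 16 bits")
        out.append((unit >> 8) & 0xFF)  # high byte
        out.append(unit & 0xFF)         # low byte
    return bytes(out)


packed = string_to_bytes("abc")
print(packed)  # b'\x00a\x00b\x00c'
```

Every character costs a shift, a mask, and a copy, which is why such a string cannot simply hand out a pointer to its storage the way a byte array can.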
I don't recall why JPython has jarray instead of array -- how do they differ? I think it's a shame that similar functionality is embodied in different APIs.
Would it make any sense to have a buffer object which only accepts a byte array as base? So that jpython would say:
    >>> buffer("abc")
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    TypeError: buffer object expected
Would it make sense to tell python users that they cannot depend on the portability of using strings (both 8bit and 16bit) as a buffer object base?
I think that the portability of many string properties is in danger with the Unicode proposal. Supporting this in the next version of JPython will be a bit tricky.
Because it is so difficult to look at Java storage as a sequence of bytes, I think I'm all for keeping the buffer() builtin and buffer object as obscure and unknown as possible <wink>.
I basically agree, and in a private email to Greg Stein I've told him this. I think that the array module should be promoted to a built-in function/type, and should be the recommended solution for data storage. The buffer API should remain a C-level API, and the buffer() built-in should be labeled with "for experts only".

--Guido van Rossum (home page: http://www.python.org/~guido/)