[Python-3000] pre-PEP: Enhancing buffer protocol (tp_as_buffer)
Travis Oliphant
oliphant.travis at ieee.org
Mon Feb 26 21:37:32 CET 2007
Guido van Rossum wrote:
> On 2/26/07, Travis Oliphant <oliphant.travis at ieee.org> wrote:
>
>>Guido van Rossum wrote:
>>
>>>I realized this thinking about the 3.0 bytes object, but the 2.x array
>>>object has the same problems, and probably every other object that
>>>uses the buffer API and has a mutable size (if there are any).
>>
>>Yes, the NumPy object has this problem as well (although it has *very*
>>conservative checks so that if the reference count on the array is not
>>1, memory is not reallocated).
>
>
> That would be *too* conservative for me -- just passing it as an
> argument to another function increfs it (for the duration of the
> call).
>
It's too conservative for us too. We just don't see any way around it
without the locking mechanism (right now you can override the ref-count
checking if you know what you are doing).
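
To make the idea of a locking mechanism concrete, here is a minimal sketch
in Python. The class name `LockedBuffer` and its `acquire`/`release`
methods are hypothetical, chosen only for illustration; they are not
NumPy's API or the API that the PEP will propose. The point is that
resizing is refused while anyone holds the memory, rather than guessing
from the reference count.

```python
class LockedBuffer:
    """Hypothetical resizable buffer that refuses to resize while exported."""

    def __init__(self, size):
        self._data = bytearray(size)
        self._exports = 0  # number of outstanding users of the memory

    def acquire(self):
        """Hand out the memory and record that it is now shared."""
        self._exports += 1
        return self._data

    def release(self):
        """Signal that one user is done with the memory."""
        if self._exports == 0:
            raise RuntimeError("release() without matching acquire()")
        self._exports -= 1

    def resize(self, new_size):
        """Grow or shrink in place, but only when nobody holds the memory."""
        if self._exports:
            raise BufferError("cannot resize: memory is exported")
        if new_size < len(self._data):
            del self._data[new_size:]
        else:
            self._data.extend(bytes(new_size - len(self._data)))

    def __len__(self):
        return len(self._data)
```

With an explicit count like this, passing the object to another function
(which only bumps the reference count) would not block a resize; only an
outstanding `acquire()` would.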
>>
>>I'm not sure what this mixin class is. Is this a base class for the
>>bytes object? I need to understand this better in order to write a PEP.
>
>
> Yes, that's a good way to describe it.
>
>
>>>- *Another* API built on top of the redesigned buffer API would be
>>>something more aligned with numpy's needs, adding (a) a shape
>>>descriptor indicating the size, offset and stride of each dimension,
>>>and (b) a record descriptor indicating the interpretation of one
>>>element of the array. For (a), a list of 3-tuples of ints would
>>>probably be sufficient (constrained so that no valid combination of
>>>indexes points outside the buffer); for (b), I propose (with Jim
>>>Hugunin who first suggested this at PyCon) to use the same concise but
>>>expressive format-string-like notation used by the struct module. (The
>>>bytes API is not quite a special case of this, since it provides more
>>>string-like operations.)
>>
>>Great. NumPy has already adopted the struct standard for its "hidden"
>>character codes.
>
>
> Glad to get agreement.
>
>
>>We also need to add some format codes for complex-data ('F','D','G') and
>>for long doubles ('g').
>
>
> No problem. Just make this a separate section in your PEP ("proposed
> additions for the struct module").
>
O.K. great.
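
As a rough illustration of a struct-style string acting as a record
descriptor, the sketch below uses only codes that the struct module
already supports. The record layout (an int followed by a double) is a
made-up example, and none of the proposed additions ('F', 'D', 'G', 'g')
are used.

```python
import struct

# Hypothetical record: a 32-bit int followed by a 64-bit float,
# little-endian with no padding ("<" selects standard sizes).
record_format = "<id"

# Bytes occupied by one element of an array of such records.
itemsize = struct.calcsize(record_format)

# Round-trip one element through its packed representation.
packed = struct.pack(record_format, 7, 2.5)
fields = struct.unpack(record_format, packed)
```

Here the format string alone is enough to compute the element size and to
interpret the raw bytes of one element.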
>
>>I would also propose that we make an
>>enumeration in Python so we can refer to these codes in C/C++ as constants:
>>
>>PYFORMAT_LONG
>>PYFORMAT_UINT
>>
>>etc.
>
>
> Not sure I follow but sounds fine; hopefully the PEP draft will clarify this.
>
This is just some header magic (either #defines or an enum, so
you don't have to remember character codes in C/C++).
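
The same idea, sketched in Python rather than in a C header: the names
follow the PYFORMAT_* pattern above, but the character values shown are an
assumption (the matching struct codes), and the helper `itemsize` is
hypothetical.

```python
import struct

# Named constants so callers need not remember the character codes.
# The values are assumed to be the corresponding struct module codes.
PYFORMAT_LONG = "l"  # C long
PYFORMAT_UINT = "I"  # C unsigned int


def itemsize(code):
    """Return the native size in bytes of one item with this format code."""
    return struct.calcsize(code)
```

In C the equivalent would be a set of #define lines or an enum with the
same names.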
>
>>a) I would prefer a 3-tuple of lists for the shape descriptor
>>(shape list, stride list, offset list)
>>
>>That way default striding could be given as None and there would not
>>have to be any offset as well.
>
>
> Of course. I don't know much about the traditional way of representing
> MD array structure.
>
>
>>My view on the offset is that it is not necessary as the start of the
>>array is already given by the memory pointer. But, if others see a
>>strong need for it, I have no problem with including it.
>
>
> Well don't you end up with an offset as soon as you take a rectangular
> slice out of a 2d array?
You can either 1) keep the same base memory pointer and create an offset
list, or 2) have no offset and change the starting memory pointer.
NumPy uses option 2 (it stores the starting point of the array).
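
The two options can be compared with a little address arithmetic. This is
only a sketch: the helper names are hypothetical, memory addresses are
plain integers, strides are in bytes, and offsets are assumed to be counted
in elements per dimension (the thread does not fix the units).

```python
def default_strides(shape, itemsize):
    """Strides in bytes for a C-contiguous array (what None would mean)."""
    strides = []
    step = itemsize
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return list(reversed(strides))


def address(start, strides, index, offsets=None):
    """Byte address of the element at `index`, with optional offsets."""
    if offsets is None:
        offsets = [0] * len(index)
    return start + sum((i + o) * s for i, o, s in zip(index, offsets, strides))


# A 4x5 array of 8-byte items whose memory starts at a pretend address.
base = 1000
strides = default_strides((4, 5), 8)

# Rectangular slice: rows 1..2, columns 2..3 of the parent array.
# Option 1: keep the base pointer and carry an offset list.
opt1 = [address(base, strides, (i, j), offsets=[1, 2])
        for i in range(2) for j in range(2)]

# Option 2: no offsets; move the starting pointer to the slice's corner.
start = address(base, strides, (1, 2))
opt2 = [address(start, strides, (i, j))
        for i in range(2) for j in range(2)]
```

Both lists contain the same addresses, which is why the offset list is
redundant once the starting pointer is stored with the view.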
>
>
>>b) I'm also fine with just returning a string for the record descriptor
>>like the struct module uses.
>
>
> Excellent. Are we all set then?
I think so. I have some additional ideas about the string format
description that I will explain in the PEP. The draft is coming along at
http://wiki.python.org/moin/ArrayInterface
Feel free to make changes there.
-Travis