[Numpy-discussion] Objects exposing the array interface

Jaime Fernández del Río jaime.frio at gmail.com
Wed Feb 25 17:48:10 EST 2015


On Wed, Feb 25, 2015 at 1:56 PM, Stephan Hoyer <shoyer at gmail.com> wrote:

>
>
> On Wed, Feb 25, 2015 at 1:24 PM, Jaime Fernández del Río <
> jaime.frio at gmail.com> wrote:
>
>> 1. When converting these objects to arrays using PyArray_Converter, if
>> the array returned by any of the array interfaces is not C contiguous,
>> aligned, and writeable, a copy that satisfies all three is made. Proper
>> arrays and subclasses are passed through unchanged. This is the source
>> of the error reported above.
>>
>
>
> I'm not entirely sure I understand this -- how is PyArray_Converter used
> in numpy? For example, if I pass a non-contiguous array to your class Foo,
> np.asarray does not make a copy:
>

It is used by many (all?) C functions that take an array as input. This
follows a different path from np.asarray and np.asanyarray, which are both
calls to np.array, which in turn maps to the C function _array_fromobject,
found here:

https://github.com/numpy/numpy/blob/maintenance/1.9.x/numpy/core/src/multiarray/multiarraymodule.c#L1592

And ufuncs have their own conversion code, which doesn't really help
either. I am not sure it would be possible to have them all share a common
code base, but it is certainly well worth trying.
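
To make the np.asarray path concrete, here is a minimal sketch. The Foo
class from earlier in the thread is not reproduced in this message, so the
one below is a hypothetical stand-in that simply forwards the
__array_interface__ of the array it wraps:

```python
import numpy as np


class Foo:
    """Hypothetical stand-in: exposes a wrapped array via __array_interface__."""

    def __init__(self, arr):
        self._arr = arr  # keep a reference so the buffer stays alive

    @property
    def __array_interface__(self):
        # Forward the wrapped array's shape, strides, typestr and data pointer.
        return self._arr.__array_interface__


# A non-contiguous slice, as in the quoted session.
orig = np.zeros((3, 4))[:2, :3]
view = np.asarray(Foo(orig))

# np.asarray builds a view on the same memory instead of copying,
# so writes through the result show up in the original slice.
shares = np.shares_memory(view, orig)
view[:] = 1
```

On this path the result stays non-contiguous and shares memory with orig,
which is why the in-place assignment in the quoted session is visible on
the next call.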


>
> In [25]: orig = np.zeros((3, 4))[:2, :3]
>
> In [26]: orig.flags
> Out[26]:
>   C_CONTIGUOUS : False
>   F_CONTIGUOUS : False
>   OWNDATA : False
>   WRITEABLE : True
>   ALIGNED : True
>   UPDATEIFCOPY : False
>
> In [27]: subclass = Foo(orig)
>
> In [28]: np.asarray(subclass)
> Out[28]:
> array([[ 0.,  0.,  0.],
>        [ 0.,  0.,  0.]])
>
> In [29]: np.asarray(subclass)[:] = 1
>
> In [30]: np.asarray(subclass)
> Out[30]:
> array([[ 1.,  1.,  1.],
>        [ 1.,  1.,  1.]])
>
>
> But yes, this is probably a bug.
>
>> 2. When converting these objects using PyArray_OutputConverter, as well as
>> in similar code in the ufunc machinery, anything other than a proper array
>> or subclass raises an error. This means that, contrary to what the docs on
>> subclassing say, see below, you cannot use an object exposing the array
>> interface as an output parameter to a ufunc.
>>
>
> Here it might be a good idea to distinguish between objects that define
> __array__ vs __array_interface__/__array_struct__. A class that defines
> __array__ might not be very ndarray-like at all, but rather be something
> that can be *converted* to an ndarray. For example, objects in pandas
> define __array__, but updating the return value of df.__array__() in-place
> will not necessarily update the DataFrame (e.g., if the frame had
> inhomogeneous dtypes).
>
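
On point 2, a sketch of one way around the restriction: convert the object
to an array first, then pass that array as the output. The Foo class here
is again a hypothetical stand-in that forwards __array_interface__, not the
class from the original report:

```python
import numpy as np


class Foo:
    """Hypothetical stand-in: exposes a wrapped array via __array_interface__."""

    def __init__(self, arr):
        self._arr = arr  # keep a reference so the buffer stays alive

    @property
    def __array_interface__(self):
        return self._arr.__array_interface__


backing = np.zeros(3)
foo = Foo(backing)

# Passing foo itself as out= is the case reported above as raising an error.
# Converting it first yields a real ndarray viewing the same memory,
# which the ufunc machinery accepts as an output argument.
out_view = np.asarray(foo)
np.add(np.arange(3.0), 1.0, out=out_view)
```

Because out_view shares memory with backing, the ufunc result ends up in
the buffer that foo exposes.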

I am not really sure what the behavior of __array__ should be. The link to
the subclassing docs I gave before indicates that it should be possible to
write to it if it is writeable (and probably pandas should set the
writeable flag to False if it cannot be reliably written to), but the
obscure comment I mentioned seems to point to the opposite, that it should
never be written to. This is probably a good moment in time to figure out
what the proper behavior should be and document it.
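
As a sketch of the "set the writeable flag to False" option, assume a
hypothetical container, Table, whose __array__ always has to build a fresh
array. This is not pandas code, only an illustration of the idea:

```python
import numpy as np


class Table:
    """Hypothetical container that can be converted to an array, not viewed."""

    def __init__(self, ints, floats):
        self.ints = list(ints)
        self.floats = list(floats)

    def __array__(self, dtype=None, copy=None):
        # The result is always newly allocated, so writing to it could
        # never update the Table. Mark it read-only to make that explicit.
        out = np.column_stack([self.ints, self.floats]).astype(dtype or float)
        out.flags.writeable = False
        return out


table = Table([1, 2], [0.5, 1.5])
arr = np.asarray(table)

try:
    arr[:] = 0  # rejected: the array is flagged read-only
    write_failed = False
except ValueError:
    write_failed = True
```

With the flag cleared, an attempted in-place write fails loudly instead of
silently modifying a temporary.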

Jaime

-- 
(\__/)
( O.o)
( > <) This is Conejo (Bunny). Copy Conejo into your signature and help him
with his plans for world domination.