[Numpy-discussion] Objects exposing the array interface

Wed Feb 25 16:24:42 EST 2015

An issue was raised yesterday on GitHub regarding np.may_share_memory when
run on a class that exposes an array through the __array__ method. You can
check the details here:

https://github.com/numpy/numpy/issues/5604

Looking into it, I found out that NumPy doesn't really treat objects
exposing __array__, __array_interface__, or __array_struct__ as if they
were proper arrays:

   1. When converting these objects to arrays using PyArray_Converter, if
   the array returned by any of the array interfaces is not C contiguous,
   aligned, and writeable, a copy that meets all three requirements is made.
   Proper arrays and subclasses are passed through unchanged. This is the
   source of the error reported above.
   2. When converting these objects using PyArray_OutputConverter, as well
   as in similar code in the ufunc machinery, anything other than a proper
   array or subclass raises an error. This means that, contrary to what the
   docs on subclassing say (see below), you cannot use an object exposing
   the array interface as an output parameter to a ufunc.
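To make the first point concrete, here is a minimal sketch of the kind of
check that triggers it. The Wrapper class is a hypothetical stand-in, and
the dtype and copy parameters on __array__ are my assumption for newer
NumPy releases, where they are simply ignored. I am deliberately not
stating what the wrapped case prints, since that depends on whether the
NumPy version in use copies the non-contiguous array during conversion:

```python
import numpy as np

class Wrapper:
    """Hypothetical stand-in that exposes an array through __array__."""
    def __init__(self, arr):
        self.arr = arr
    def __array__(self, dtype=None, copy=None):
        # dtype and copy are accepted for newer NumPy releases; ignored here.
        return self.arr

base = np.arange(10.0)
view = base[::2]  # strided view: shares memory with base, not C contiguous

# With two real ndarrays the overlap is detected.
plain = np.may_share_memory(view, base)

# With the wrapper, the answer depends on whether the conversion copied.
wrapped = np.may_share_memory(Wrapper(view), base)

print("plain view:  ", plain)
print("wrapped view:", wrapped)
```

If the wrapped line prints False, the conversion produced a copy, which is
the behavior described in item 1.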

The following classes can be used to test this behavior:

class Foo:
    def __init__(self, arr):
        self.arr = arr
    def __array__(self):
        return self.arr

class Bar:
    def __init__(self, arr):
        self.arr = arr
        self.__array_interface__ = arr.__array_interface__

class Baz:
    def __init__(self, arr):
        self.arr = arr
        self.__array_struct__ = arr.__array_struct__

They all behave the same with these examples:

>>> a = Foo(np.ones(5))
>>> np.add(a, a)
array([ 2.,  2.,  2.,  2.,  2.])
>>> np.add.accumulate(a)
array([ 1.,  2.,  3.,  4.,  5.])
>>> np.add(a, a, out=a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: return arrays must be of ArrayType
>>> np.add.accumulate(a, out=a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: output must be an array
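For reference, a workaround sketch: convert the wrapper explicitly and pass
the result as out. This assumes that np.asarray hands back the wrapped
array without copying, which I believe holds when __array__ returns an
ndarray and no dtype is requested. The dtype and copy parameters on
__array__ are again my assumption for newer NumPy releases:

```python
import numpy as np

class Foo:
    def __init__(self, arr):
        self.arr = arr
    def __array__(self, dtype=None, copy=None):
        # dtype and copy are ignored in this sketch.
        return self.arr

arr = np.ones(5)
a = Foo(arr)

# Convert once, explicitly, and use the resulting ndarray as the output.
out = np.asarray(a)
np.add(a, a, out=out)

# The result lands in the array that the wrapper exposes.
print(np.shares_memory(out, arr))
print(arr)
```

This only sidesteps the problem, of course: the caller has to know that the
object is a wrapper, which is exactly what the interfaces are meant to hide.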

I think this should be changed, and whatever gets handed back by these
methods/interfaces should be treated as if it were an array or a subclass
of it. This is actually what the docs on subclassing say about __array__
here:

http://docs.scipy.org/doc/numpy/reference/arrays.classes.html#numpy.class.__array__

This also seems to contradict a rather cryptic comment in the code of
PyArray_GetArrayParamsFromObject, which is part of the call sequence of
this whole mess, see here:

https://github.com/numpy/numpy/blob/maintenance/1.9.x/numpy/core/src/multiarray/ctors.c#L1495

/*
 * If op supplies the __array__ function.
 * The documentation says this should produce a copy, so
 * we skip this method if writeable is true, because the intent
 * of writeable is to modify the operand.
 * XXX: If the implementation is wrong, and/or if actual
 *      usage requires this behave differently,
 *      this should be changed!
 */

There has already been some discussion in the issue linked above, but I
would appreciate any other thoughts on the idea of treating objects with
some form of array interface as if they were arrays. Does it need a
deprecation cycle? Is there some case I am not considering where this could
go horribly wrong?

Jaime

-- 
(\__/)
( O.o)
( > <) This is Bunny. Copy Bunny into your signature and help him with his
plans for world domination.