On Fri, Sep 27, 2013 at 7:34 PM, Pauli Virtanen wrote:
27.09.2013 19:33, Nathaniel Smith wrote: [clip]
I really don't understand what arcane magic is used to make ndarray += csc_matrix work at all, but my question is, is it going to break when we complete the casting transition described above? It was just supposed to catch things like int += float.
This maybe clarifies it:
    >>> import numpy
    >>> import scipy.sparse
    >>> x = numpy.ones((2,2))
    >>> y = scipy.sparse.csr_matrix(x)
    >>> z = x
    >>> z += y
    >>> x
    array([[ 1.,  1.],
           [ 1.,  1.]])
    >>> z
    matrix([[ 2.,  2.],
            [ 2.,  2.]])
The execution flows like this:
    1. ndarray.__iadd__(arr, sparr)
    2.     np.add(arr, sparr, out=???)
               return NotImplemented   # wtf
           return NotImplemented
    3. Python does arr = sparr.__radd__(arr)
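The fallback in the last step is ordinary Python behaviour and can be reproduced without Numpy or Scipy. Below is a minimal sketch, assuming three hypothetical stand-in classes (Dense, Sparse and Result are invented for illustration and are not the real ndarray or sparse matrix types). It shows why z ends up as a new object while x is left untouched:

```python
class Result:
    """Hypothetical stand-in for the np.matrix that the sparse type returns."""
    def __init__(self, value):
        self.value = value


class Dense:
    """Hypothetical stand-in for ndarray."""
    def __init__(self, value):
        self.value = value

    def __iadd__(self, other):
        if isinstance(other, Sparse):
            # Decline the operand, as the in-place op effectively did.
            return NotImplemented
        self.value += other
        return self

    def __add__(self, other):
        if isinstance(other, Sparse):
            return NotImplemented
        return Dense(self.value + other)


class Sparse:
    """Hypothetical stand-in for a scipy.sparse matrix."""
    def __init__(self, value):
        self.value = value

    def __radd__(self, other):
        # Builds a brand-new object; nothing is modified in place.
        return Result(other.value + self.value)


x = Dense(1.0)
y = Sparse(1.0)
z = x
# __iadd__ and __add__ both decline, so Python falls back to
# y.__radd__(z) and rebinds the name z to the returned object.
z += y
```

After this runs, z is a Result holding 2.0 and x is still the original Dense holding 1.0, which mirrors the array/matrix split in the session above.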
In Scipy master, sparse matrices now have __numpy_ufunc__, but it doesn't handle out= arguments, so the second step (the np.add call) currently raises a TypeError (for Numpy master + Scipy master).
And this is actually the correct thing to do, as having np.add return NotImplemented is just broken. Only ndarray.__iadd__ has the authority to return NotImplemented.
To make the in-place ops work again, it seems Numpy needs some additional fixes in its binary op machinery, before __numpy_ufunc__ business works fully as intended. Namely, the binary op routines will need to catch TypeErrors and convert them to NotImplemented.
The code paths where Numpy ufuncs currently return NotImplemented could also be changed to raise TypeErrors, but I'm not sure if someone somewhere relies on this behavior (I hope not).
Okay, so I see three separate issues:

1) My original concern, that the upcoming casting change for in-place operations will cause some horrible interaction. Tentatively this seems like it might be okay since even after the "cast" succeeds, np.add is still just refusing to do the operation, so hopefully we can set it up so that it will continue to fail once the casting rule becomes more strict.

2) The issue that ufuncs return NotImplemented and it makes baby Guido cry. This is completely broken, agreed. Not sure when someone will get around to clearing this stuff up.

3) The issue of how to make an in-place operation like ndarray += sparse continue to work in the brave new __numpy_ufunc__ world.

For this last issue, I think we disagree. It seems to me that the right answer is that csc_matrix.__numpy_ufunc__ needs to step up and start supporting out=! If I have a large dense ndarray and I try to += a sparse array to it, this operation should take no temporary memory and time proportional to nnz. Right now it sounds like it actually copies the large dense ndarray, which takes time and space proportional to its size. AFAICT the only way to avoid that is for scipy.sparse to implement out=. It shouldn't be that hard...?

-n
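To make the cost argument concrete, here is a minimal sketch of an in-place dense += sparse that touches only the nonzero entries. It assumes the sparse operand is available as COO-style triplets (scipy.sparse.coo_matrix exposes these as .row, .col and .data). The helper add_coo_inplace is hypothetical and is not how scipy.sparse implements anything; it only shows that the work can scale with nnz rather than with the size of the dense array.

```python
import numpy as np


def add_coo_inplace(dense, rows, cols, vals):
    """Hypothetical helper: add COO entries into `dense` without copying it."""
    # np.add.at does unbuffered in-place addition, so a (row, col) pair
    # that appears more than once is accumulated instead of overwritten.
    np.add.at(dense, (rows, cols), vals)
    return dense


dense = np.ones((3, 3))
alias = dense                      # second reference to the same buffer
rows = np.array([0, 2, 2])
cols = np.array([1, 0, 0])         # entry (2, 0) is listed twice
vals = np.array([5.0, 1.0, 2.0])

add_coo_inplace(dense, rows, cols, vals)
```

Because the update is written into the existing buffer, alias sees the new values too, which is the behaviour one would expect from a true in-place +=.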