
On Fri, Sep 6, 2019 at 11:53 AM Ralf Gommers <ralf.gommers@gmail.com> wrote:
On Fri, Sep 6, 2019 at 12:53 AM Nathaniel Smith <njs@pobox.com> wrote:
On Tue, Sep 3, 2019 at 2:04 AM Hameer Abbasi <einstein.edison@gmail.com> wrote:
The fact that we're having to design more and more protocols for a lot of very similar things is, to me, an indicator that we do have holistic problems that ought to be solved by a single protocol.
But the reason we've had trouble designing these protocols is that they're each different :-). If it was just a matter of copying __array_ufunc__ we'd have been done in a few minutes...
I don't think that argument is correct. That we now have two very similar protocols is simply a matter of history and limited developer time. NEP 18 discusses in several places that __array_function__ should be brought in line with __array_ufunc__, and that we can migrate functions from one protocol to the other. There's no technical reason, other than backwards compatibility and dev time, why we couldn't use __array_function__ for ufuncs as well.
Huh, that's interesting! Apparently we have a profoundly different understanding of what we're doing here. To me, __array_ufunc__ and __array_function__ are completely different. In fact I'd say __array_ufunc__ is a good idea and __array_function__ is a bad idea, and I would definitely not be in favor of combining them.

The key difference is that __array_ufunc__ allows for *generic* implementations. Most duck array libraries can write a single implementation of __array_ufunc__ that works for *all* ufuncs, even new third-party ufuncs that the duck array library has never heard of, because ufuncs all share the same structure of a loop wrapped around a core operation, and the library can treat the core operation as a black box. For example:

- Dask can split the operation across its tiled sub-arrays, and then invoke the core operation on each tile.
- xarray can do its label-based axis matching, and then invoke the core operation.
- bcolz can loop over the array, uncompressing one block at a time and invoking the core operation on each block.
- A sparse array library can check the ufunc's .identity attribute to find out whether 0 is an identity; if so, it can invoke the operation directly on the non-zero entries, and otherwise it can loop over the array, densifying it in blocks and invoking the core operation on each. (It would be useful to have a bit more metadata on the ufunc, so that e.g. np.subtract could declare that zero is a right-identity but not a left-identity, but that's a simple enough extension to make at some point.)

Result: __array_ufunc__ makes it totally possible to take a ufunc from scipy.special, or a brand-new one created with numba, and have it immediately work on an xarray wrapped around dask wrapped around bcolz, out of the box. That's a clean, generic interface. [1]

OTOH, __array_function__ doesn't allow this kind of simplification: if we were using __array_function__ for ufuncs, every library would have to special-case every individual ufunc, which leads to dramatically more work and more potential for bugs.
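To make the contrast concrete, here's a rough sketch of the __array_ufunc__ side. The "BlockArray" class is a made-up toy (loosely bcolz-flavored, not code from any real library); the point is that one method, written once, handles every ufunc:

    import numbers
    import numpy as np

    class BlockArray:
        """Hypothetical duck array storing data as a list of 1-d blocks
        (a toy stand-in for something like bcolz's chunked storage)."""

        def __init__(self, blocks):
            self.blocks = [np.asarray(b, dtype=float) for b in blocks]

        def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
            # One generic implementation covers *every* ufunc, including
            # third-party ufuncs this class has never heard of.
            if method != "__call__" or kwargs:
                return NotImplemented  # keep the sketch simple: no out=, no reduce, etc.
            # Defer on operand types we don't understand. This NotImplemented
            # handshake is also what gives another library's __array_ufunc__
            # a chance in mixed-library expressions.
            if not all(isinstance(x, (BlockArray, numbers.Number)) for x in inputs):
                return NotImplemented
            # Treat the core operation as a black box and invoke it per block
            # (assumes all BlockArray inputs share the same block layout).
            n = next(len(x.blocks) for x in inputs if isinstance(x, BlockArray))

            def pick(x, i):
                return x.blocks[i] if isinstance(x, BlockArray) else x

            return BlockArray([ufunc(*(pick(x, i) for x in inputs))
                               for i in range(n)])

    a = BlockArray([[1.0, 2.0], [3.0, 4.0]])
    print(np.add(a, 1).blocks)  # [array([2., 3.]), array([4., 5.])]
    print(np.sqrt(a).blocks)    # any ufunc dispatches here, e.g. scipy.special.erf too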
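And here's the __array_function__ side for comparison, sketched with the HANDLED_FUNCTIONS lookup-table pattern that NEP 18 itself suggests (again toy code, reusing BlockArray from above). Every function needs its own registration, and anything unregistered simply fails:

    HANDLED_FUNCTIONS = {}

    def implements(np_function):
        # Register a BlockArray implementation for one specific function.
        def decorator(func):
            HANDLED_FUNCTIONS[np_function] = func
            return func
        return decorator

    class BlockArrayNF(BlockArray):
        def __array_function__(self, func, types, args, kwargs):
            if func not in HANDLED_FUNCTIONS:
                # No generic fallback is possible here: NumPy raises
                # TypeError for anything we haven't explicitly registered.
                return NotImplemented
            if not all(issubclass(t, BlockArrayNF) for t in types):
                return NotImplemented
            return HANDLED_FUNCTIONS[func](*args, **kwargs)

    @implements(np.sum)   # one registration per function...
    def _sum(arr):
        return sum(float(block.sum()) for block in arr.blocks)

    @implements(np.mean)  # ...and another, and so on, for every function
    def _mean(arr):
        return np.sum(arr) / sum(block.size for block in arr.blocks)

    b = BlockArrayNF([[1.0, 2.0], [3.0, 4.0]])
    print(np.sum(b), np.mean(b))  # 10.0 2.5
    # np.ptp(b) would raise TypeError: nobody registered np.ptp.

If ufuncs were routed through this protocol, that per-function table is what every duck array library would have to maintain, for every ufunc in every library.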
To me, the whole point of interfaces is to reduce coupling. When you have N interacting modules, it's unmaintainable if every change requires considering every N! combination. From this perspective, __array_function__ isn't good, but it is still somewhat constrained: the result of each operation is still determined by the objects involved, nothing else. In this regard, uarray is even more extreme than __array_function__, because arbitrary operations can be arbitrarily changed by arbitrarily distant code. It sort of feels like the argument for uarray is: well, designing maintainable interfaces is a lot of work, so forget it, let's just make it easy to monkeypatch everything and call it a day.

That said, in my replies in this thread I've been trying to stay productive and focus on narrower concrete issues. I'm pretty sure that __array_function__ and uarray will turn out to be bad ideas and will fail, but that's not a proven fact, it's just an informed guess. And the road that I favor also has lots of risks and uncertainty. So I don't have a problem with trying both as experiments and learning more! But hopefully that explains why it's not at all obvious that uarray solves the protocol design problems we've been talking about.

-n

[1] There are also some cases that __array_ufunc__ doesn't handle as nicely. One obvious one is that GPU/TPU libraries still need to special-case individual ufuncs. But that's not a limitation of __array_ufunc__, it's a limitation of GPUs – they can't run CPU code, so they can't use the CPU implementation of the core operations. Another limitation is that __array_ufunc__ is weak at handling operations that involve mixed libraries (e.g. np.add(bcolz_array, sparse_array)): to work well, this might require that bcolz have special-case handling for sparse arrays, or vice versa, so you still potentially have some N**2 special cases, though at least here N is the number of duck array libraries rather than the number of ufuncs. (The NotImplemented handshake in the sketch above lets each library defer to the other, but someone still has to implement each cross-library case.) I think this is an interesting target for future work. But in general, __array_ufunc__ goes a long way toward taming the complexity of interacting libraries and ufuncs.

--
Nathaniel J. Smith -- https://vorpus.org