Nathaniel,

Thanks for raising these thoughtful concerns. Your independent review of this proposal is greatly appreciated!

See my responses inline below:

On Mon, Aug 13, 2018 at 2:44 AM Nathaniel Smith <njs@pobox.com> wrote:
The other approach would be to incrementally add clean, well-defined
dunder methods like __array_ufunc__, __array_concatenate__, etc. This
way we end up putting some thought into each interface, making sure
that it's something we can support, protecting downstream libraries
from unnecessary complexity (e.g. they can implement
__array_concatenate__ instead of hstack, vstack, row_stack,
column_stack, ...), or avoiding adding new APIs entirely (e.g., by
converting existing functions into ufuncs so __array_ufunc__ starts
automagically working). And in the end we get a clean list of dunder
methods that new array container implementations have to define. It's
plausible to imagine a generic test suite for array containers. (I
suspect that every library that tries to implement __array_function__
will end up with accidental behavioral differences, just because the
numpy API is so vast and contains so many corner cases.) So the
clean-well-defined-dunders approach has lots of upsides. The big
downside is that this is a much longer road to go down.

RE: accidental differences in behavior:

I actually think that the __array_function__ approach is *less* prone to accidental differences in behavior, because we require implementing every function directly (or it raises an error).

This avoids a classic subclassing problem that has plagued NumPy for years, where overriding the behavior of method A causes apparently unrelated method B to break, because it relied on method A internally. In NumPy, this constrained our implementation of np.median(), because it needed to call np.mean() in order for subclasses implementing units to work properly.

There will certainly be accidental differences in behavior for third-party code that *uses* NumPy, but this is basically inevitable for any proposal to allow's NumPy's public API to be overloaded. It's also avoided by default by third-party libraries that follow the current best practice of casting all input arrays with np.asarray().

--------------

RE: a hypothetical simplified interface:

The need to implement everything you want to use in NumPy's public API could certainly be onerous, but on the other hand there are a long list of projects that have already done this today -- and these are the projects that most need __array_function__.

I'm sure there are cases were simplification would be warranted, but in particular I don't think __array_concatenate__ has significant advantages over simply implementing __array_function__ for np.concatenate. It's a slightly different way of spelling, but it basically does the same thing. The level of complexity to implement hstack, vstack, row_stack and column_stack in terms of np.concatenate is pretty minimal. __array_function__ implementors could easily copy and paste code from NumPy or use a third-party helpers library (like NDArrayOperatorsMixin) that provides such implementations.

I also have other concerns about the "simplified API" approach beyond the difficulty of figuring it out, but those are already mentioned in the NEP:
http://www.numpy.org/neps/nep-0018-array-function-protocol.html#implementations-in-terms-of-a-limited-core-api

But... this is wishful thinking. No matter what the NEP says, I simply
don't believe that we'll actually go break dask, sparse arrays,
xarray, and sklearn in a numpy point release. Or any numpy release.
Nor should we. If we're serious about keeping this experimental – and
I think that's an excellent idea for now! – then IMO we need to do
something more to avoid getting trapped by backwards compatibility.

I agree, but to be clear, development for dask, sparse and xarray (and even broadly supported machine learning libraries like TensorFlow) still happens at a much faster pace than is currently the case for "core" projects in the SciPy stack like NumPy. It would not be a big deal to encounter breaking changes in a "major" NumPy release (i.e., 1.X -> 1.(X+1)).

(Side note: sklearn doesn't directly implement any array types, so I don't think it would make use of __array_function__ in any way, except possibly to implement overloadable functions.)
 
My suggestion: at numpy import time, check for an envvar, like say
NUMPY_EXPERIMENTAL_ARRAY_FUNCTION=1. If it's not set, then all the
__array_function__ dispatches turn into no-ops. This lets interested
downstream libraries and users try this out, but makes sure that we
won't have a hundred thousand end users depending on it without
realizing.  
 
- makes it easy for end-users to check how much overhead this adds (by
running their code with it enabled vs disabled)
- if/when we decide to commit to supporting it for real, we just
remove the envvar.

I'm slightly concerned that the cost of reading an environment variable with os.environ could exaggerate the performance cost of __array_function__. It takes about 1 microsecond to read an environment variable on my laptop, which is comparable to the full overhead of __array_function__. So we may want to switch to an explicit Python API instead, e.g., np.enable_experimental_array_function().

My bigger concern is when/how we decide to graduate __array_function__ from requiring an explicit opt-in. We don't need to make a final decision now, but it would be good to clear about what specifically we are waiting for.

I see three types of likely scenarios for changing __array_function__:
1. We decide that the overloading the NumPy namespace in general is a bad idea, based on either performance or predictability consequences for third-party libraries. In this case, I imagine we would probably keep __array_function__, but revert to a separate namespace for explicitly overloaded functions, e.g., numpy.api.
2. We want to keep __array_function__, but need a breaking change to the interface (and we're really attached to keeping the name __array_function__).
3. We decide that specific functions should use a different interface (e.g., switch from __array_function__ to __array_ufunc__).

(1) and (2) are the sort of major concerns that in my mind would warrant hiding a feature behind an experimental flag. For the most part, I expect (1) could be resolved relatively quickly by running benchmark suites after we have a working version of __array_function__. To be honest, I don't see either of these rollback scenarios as terribly likely, but the downside risk is large enough that we may want to protect ourselves for a major release or two (6-12 months).

(3) will be a much longer process, likely to stretch out over years at the current pace of NumPy development. I don't think we'll want to keep an opt-in flag for this long of a period. Rather, we may want to accept a shorter deprecation cycle than usual. In most cases, I suspect we could incrementally switch to new overloads while preserving the __array_function__ overload for a release or two.

I don't really understand the 'types' frozenset. The NEP says "it will
be used by most __array_function__ methods, which otherwise would need
to extract this information themselves"... but they still need to
extract the information themselves, because they still have to examine
each object and figure out what type it is. And, simply creating a
frozenset costs ~0.2 µs on my laptop, which is overhead that we can't
possibly optimize later...

The most flexible alternative would be to just say that we provide an fixed-length iterable, and return a tuple object. (In my microbenchmarks, it's faster to make a tuple than a list or set.) In an early draft of the NEP, I proposed exactly this, but the speed difference seemed really marginal to me.

I included 'types' in the interface because I really do think it's something that almost all __array_function__ implementations should use use. It preserves a nice separation of concerns between dispatching logic and implementations for a new type. At least as long as __array_function__ is experimental, I don't think we should be encouraging people to write functions that could return NotImplemented directly and to rely entirely on the NumPy interface.

Many but not all implementations will need to look at argument types. This is only really essential for cases where mixed operations between NumPy arrays and another type are allowed. If you only implement the NumPy interface for MyArray objects, then in the usual Python style you wouldn't need isinstance checks.

It's also important from an ecosystem perspective. If we don't make it easy to get type information, my guess is that many __array_function__ authors wouldn't bother to return NotImplemented for unexpected types, which means that __array_function__ will break in weird ways when used with objects from unrelated libraries.

Cheers,
Stephan