On Friday, Apr 26, 2019 at 10:31 AM, Ralf Gommers <ralf.gommers@gmail.com> wrote:

On Fri, Apr 26, 2019 at 1:02 AM Stephan Hoyer <shoyer@gmail.com> wrote:

On Thu, Apr 25, 2019 at 3:39 PM Ralf Gommers <ralf.gommers@gmail.com> wrote:

On Fri, Apr 26, 2019 at 12:04 AM Stephan Hoyer <shoyer@gmail.com> wrote:
I do like the look of this, but keep in mind that there is a downside to exposing the implementation of NumPy functions -- now the implementation details become part of NumPy's API. I suspect we do not want to commit ourselves to never changing the implementation of NumPy functions, so at the least this will need careful disclaimers about non-guarantees of backwards compatibility.

I honestly still am missing the point of claiming this. There is no change either way to what we've done for the last decade. If we change anything in the numpy implementation of any function, we use deprecation warnings etc. What am I missing here?

Hypothetically, suppose we rewrite np.stack() in terms of np.block() instead of np.concatenate(), because it turns out it is faster.

As long as we're coercing with np.asarray(), users don't notice any material difference -- their code just gets a little faster.

But this could be problematic if we support duck typing. For example, suppose dask arrays rely on NumPy's definition of np.stack in terms of np.concatenate, but they never bothered to implement np.block. Now upgrading NumPy breaks dask.
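The scenario above can be sketched in a few lines. This is only an illustration under stated assumptions: DuckArray, stack_via_concatenate and stack_via_block are hypothetical names invented here, they are not NumPy's or dask's actual code, the stacking is limited to 1-D inputs, and it assumes a NumPy version where __array_function__ dispatch is enabled by default (1.17 or later).

```python
import numpy as np


class DuckArray:
    """Hypothetical duck array: overrides np.concatenate, never implemented np.block."""

    def __init__(self, data):
        self.data = np.asarray(data)

    def __getitem__(self, key):
        return DuckArray(self.data[key])

    def __array_function__(self, func, types, args, kwargs):
        if func is np.concatenate:
            arrays, *rest = args
            # Unwrap DuckArray inputs, defer to NumPy, then wrap the result again.
            raw = [a.data if isinstance(a, DuckArray) else a for a in arrays]
            return DuckArray(np.concatenate(raw, *rest, **kwargs))
        # Every other NumPy function, including np.block, is unsupported.
        return NotImplemented


def stack_via_concatenate(arrays):
    # Hypothetical "old" implementation: add a leading axis, then concatenate.
    # Note that there is no np.asarray() coercion of the inputs.
    return np.concatenate([a[np.newaxis] for a in arrays], axis=0)


def stack_via_block(arrays):
    # Hypothetical "new" implementation: same result for 1-D ndarrays,
    # but it now goes through np.block instead of np.concatenate.
    return np.block([[a] for a in arrays])


xs = [DuckArray([1, 2]), DuckArray([3, 4])]
print(stack_via_concatenate(xs).data)   # the old implementation works with the duck array
try:
    stack_via_block(xs)
except TypeError as exc:
    print("broken by the rewrite:", exc)
```

For plain ndarrays both versions give the same answer, so the rewrite looks safe from NumPy's side; only the duck array, which depended on the np.concatenate call, notices the change.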

Thanks, this helped clarify what's going on here. This example is clear. The problem seems to be that there are two separate discussions in this thread:

1. your original proposal, __numpy_implementation__. It does not have the problem of your np.stack example, as the "numpy implementation" is exactly the same as it is today.

2. splitting up the current numpy implementation into *multiple* entry points. This can be with and without coercion, with and without checking for invalid values, etc.
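To make (2) concrete, here is a minimal sketch of one function exposed through two entry points, with and without coercion. The names reverse, reverse_no_coerce and _reverse_impl are hypothetical and chosen only for illustration; nothing like them is proposed in this thread or exists in NumPy.

```python
import numpy as np


def _reverse_impl(a):
    # Shared implementation. It relies only on slicing, which becomes an
    # observable detail as soon as uncoerced inputs are allowed in.
    return a[::-1]


def reverse(a):
    # Entry point with coercion: the input is always turned into an
    # ndarray first, so callers never see how _reverse_impl works.
    return _reverse_impl(np.asarray(a))


def reverse_no_coerce(a):
    # Hypothetical second entry point without coercion: any object that
    # supports a[::-1] is passed straight through to the implementation.
    return _reverse_impl(a)


print(type(reverse([1, 2, 3])))            # always numpy.ndarray
print(type(reverse_no_coerce([1, 2, 3])))  # whatever the input's slicing returns
```

With the second entry point, changing _reverse_impl to use something other than slicing would change what non-ndarray callers get back, which is the sense in which (2) enlarges the API surface.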

So far NEP 18 does (1). Your proposed __numpy_implementation__ addition to NEP 18 is still (1). Claiming that this affects the situation with respect to backwards compatibility is incorrect.

(2) is actually a much more invasive change, and one that does much more to increase the size of the NumPy API surface. And yes, it affects our backwards compatibility situation as well.

Also note that these have very different purposes:

(1) was to (quoting from the NEP) "allow using NumPy as a high level API for efficient multi-dimensional array operations, even with array implementations that differ greatly from numpy.ndarray."

(2) is for making duck arrays work with numpy implementations of functions (not just with the NumPy API)

I think (1) is mostly achieved, and I'm +1 on your NEP addition for that. (2) is quickly becoming a mess, and I agree with Nathaniel's sentiment above "I shouldn't expect __array_function__ to be useful for duck arrays?". For (2) we really need to go back and have a well thought out design. Hameer's mention of uarray could be that. Growing more __array_*__ protocols in a band-aid fashion seems unlikely to get us there.

This is basically the same reason why subclass support has been hard to maintain in NumPy. Apparently safe internal changes to NumPy functions can break other array types in surprising ways, even if they do not intentionally deviate from NumPy's semantics.

Agreed. Therefore optionally skipping asarray & co is a separate discussion. That's part of the problem caused by numpy trying to be both a library and an end user interface - and often those goals conflict.

Cheers,

Ralf

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion