NEP 30 - Duck Typing for NumPy Arrays - Implementation
Hi,
we have a new proposal for the implementation of NumPy array duck typing [1] [2], following the high-level overview described in NEP-22 [3].
Would be great to get some comments on that.
[1] https://github.com/numpy/numpy/blob/master/doc/neps/nep0030duckarrayprot...
[2] https://github.com/numpy/numpy/pull/14170
[3] https://numpy.org/neps/nep0022ndarrayducktypingoverview.html
Best, Peter
Hi Peter, thanks for writing that up!
On Mon, Aug 5, 2019 at 8:07 AM Peter Andreas Entschev peter@entschev.com wrote:
> Hi,
> we have a new proposal for the implementation of NumPy array duck typing [1] [2], following the high-level overview described in NEP-22 [3].
A couple of high-level comments:
Having __array__ give a TypeError is fine for libraries that want to prevent unintentional coercion with, e.g., `np.asarray(my_ducktype)`. However, that leaves the obvious question of what the right way is to do this intentionally. It would be good to recommend something, for example a `numpy()` or `to_numpy()` method. Also, the NEP should make it clearer that this is not the obviously correct thing to do; it only makes sense in cases where coercion is very expensive, like CuPy and Sparse. For Dask, for example, coercion to a NumPy array is perfectly reasonable.
The NEP currently does not say who this is meant for. Would you expect libraries like SciPy to adopt it, for example?
The NEP also (understandably) punts on the question of when something is a valid duck array. If you want this to be widely used, that will need an answer, or at least some rough guidance. For example, we would expect a duck array to have a mean() method, but probably not a ptp() method. A library author who wants to use np.duckarray() needs to know, because she can't test with all existing and future duck array implementations.
An alternative to introducing np.duckarray() would be to just modify np.asarray(). Of course this has backwards compatibility impact, but if you're going to be raising a TypeError from __array__ then that impact is there anyway. Note: I don't think this is necessarily a better idea, because it may lead to less clear errors, but it's worth putting in the alternatives section at least.
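To make the first comment above concrete, here is a minimal sketch of a duck array that refuses implicit coercion but offers an explicit conversion method. The class name `ExpensiveDuckArray` and its internals are hypothetical and only for illustration; `to_numpy()` is one of the method names suggested in the comment, not an agreed API.

```python
import numpy as np


class ExpensiveDuckArray:
    """Hypothetical duck array whose conversion to NumPy is costly."""

    def __init__(self, data):
        # A plain ndarray stands in for device or sparse storage here.
        self._data = np.asarray(data)

    def __array__(self, dtype=None, copy=None):
        # Refuse implicit coercion, e.g. via np.asarray(obj).
        raise TypeError(
            "Implicit conversion to a NumPy array is not allowed; "
            "call to_numpy() to convert explicitly."
        )

    def to_numpy(self):
        # Explicit, intentional conversion requested by the caller.
        return self._data.copy()


duck = ExpensiveDuckArray([1, 2, 3])

try:
    np.asarray(duck)
    implicit_ok = True
except TypeError:
    implicit_ok = False

explicit = duck.to_numpy()
```

With this pattern an accidental `np.asarray(duck)` fails loudly, while a caller who really wants an ndarray can still get one through the explicit method.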
Cheers, Ralf
NumPyDiscussion mailing list NumPyDiscussion@python.org https://mail.python.org/mailman/listinfo/numpydiscussion
On Mon, Aug 5, 2019 at 2:48 PM Ralf Gommers ralf.gommers@gmail.com wrote:
> Having __array__ give a TypeError is fine for libraries that want to prevent unintentional coercion with, e.g., `np.asarray(my_ducktype)`. However that leaves the obvious question of what the right way is to do this intentionally. Would be good to recommend something, for example a `numpy()` or `to_numpy()` method. Also, the NEP should make it clearer that this is not the obviously correct thing to do, it only makes sense in cases where coercion is very expensive, like CuPy and Sparse. For Dask for example, coercion to a numpy array is perfectly reasonable.
I agree, we need another solution for explicit array conversion, either from duck arrays to NumPy arrays or between duck arrays.
As has come up on GitHub [1], I think this should probably be another protocol, to allow for third-party conversions like sparse <-> dask that in principle could be implemented by either library.
To get the discussion started, here's one possible proposal for what the NumPy API(s) could look like:
    np.coerce(sparse_array)  # by default, coerce to np.ndarray
    np.coerce(sparse_array, dask.array.Array)  # coerces to dask
    np.coerce_like(sparse_array, dask_array)  # coerce like the second array type
    np.coerce_arrays(list_of_arrays)  # coerce to first type that can handle everything
The protocol itself should probably either use __array_function__ (e.g., for np.coerce_like, if all the dispatched-on arguments are arrays) or a custom protocol in the same style that allows for implementations on either the array being converted or the type of the result [2].
[1] https://github.com/numpy/numpy/issues/13831
[2] https://github.com/numpy/numpy/pull/14170#issuecomment517004293
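As a rough illustration of the "custom protocol" option, the sketch below lets either the array being converted or the result type implement the conversion. Everything here is an assumption made for the example: the hook names `__coerce_to__` and `__coerce_from__`, the standalone `coerce` and `coerce_like` functions, and the toy classes are hypothetical and are not part of NumPy or of the proposal in this thread.

```python
import numpy as np


def coerce(array, target_type=np.ndarray):
    """Explicitly convert ``array`` to ``target_type`` (hypothetical API)."""
    if isinstance(array, target_type):
        return array

    # First give the source array a chance to perform the conversion.
    to_hook = getattr(type(array), "__coerce_to__", None)
    if to_hook is not None:
        result = to_hook(array, target_type)
        if result is not NotImplemented:
            return result

    # Otherwise let the target type try to build itself from the source.
    from_hook = getattr(target_type, "__coerce_from__", None)
    if from_hook is not None:
        result = from_hook(array)
        if result is not NotImplemented:
            return result

    raise TypeError(
        f"cannot coerce {type(array).__name__} to {target_type.__name__}"
    )


def coerce_like(array, other):
    """Coerce ``array`` to the type of ``other``."""
    return coerce(array, type(other))


class ToySparse:
    """Stand-in for a sparse array; knows how to become an ndarray."""

    def __init__(self, dense):
        self.dense = np.asarray(dense)

    def __coerce_to__(self, target_type):
        if target_type is np.ndarray:
            return self.dense.copy()
        return NotImplemented


class ToyLazy:
    """Stand-in for a lazy array; knows how to wrap a ToySparse."""

    def __init__(self, source):
        self.source = source

    @classmethod
    def __coerce_from__(cls, array):
        if isinstance(array, ToySparse):
            return cls(array)
        return NotImplemented


sparse_array = ToySparse([[0, 1], [0, 0]])
dense = coerce(sparse_array)             # handled by the source type
lazy = coerce(sparse_array, ToyLazy)     # handled by the target type
same_kind = coerce_like(sparse_array, lazy)
```

Returning `NotImplemented` to defer to the other side mirrors how Python's binary operators and `__array_function__` handle dispatch, which is why it is used here.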
> The NEP currently does not say who this is meant for. Would you expect libraries like SciPy to adopt it for example?
> The NEP also (understandably) punts on the question of when something is a valid duck array. If you want this to be widely used, that will need an answer or at least some rough guidance though. For example, we would expect a duck array to have a mean() method, but probably not a ptp() method. A library author who wants to use np.duckarray() needs to know, because she can't test with all existing and future duck array implementations.
I think this is covered in NEP-22 already. As discussed there, I don't think NumPy is in a good position to pronounce decisive APIs at this time. I would welcome efforts to try, but I don't think that's essential for now.
Thanks for the concerns raised, and Stephan for promptly answering them.
> An alternative to introducing np.duckarray() would be to just modify np.asarray(). Of course this has backwards compatibility impact, but if you're going to be raising a TypeError from __array__ then that impact is there anyway. Note: I don't think this is necessarily a better idea, because it may lead to less clear errors, but it's worth putting in the alternatives section at least.
I don't know if mentioning alternatives in that way is good, it gives me the impression that NumPy is encouraging them (unless it really is). In that sense, I think the best option is indeed to go down the path of finding a good coercion solution (as Stephan already mentioned); later we could just add a pointer to the new NEP in this one.
Best, Peter
On Tue, 2019-08-06 at 10:24 +0200, Peter Andreas Entschev wrote:
> I don't know if mentioning alternatives in that way is good, it gives me the impression that NumPy is encouraging them (unless it really is).
Well, if you think alternatives is too open to actually using them, I think it would be fine to list them as "Rejected Alternatives"?
- Sebastian
Sure, I wouldn't mind doing that, but it would also be better to have a clear alternative/complement to duck array (as I'm hoping will be the case with coerce). I will try to give a bit more thought to the coercion ideas and start writing a NEP for that this week and the next. Perhaps we can then sync both NEPs.
On Tue, Aug 6, 2019 at 4:23 PM Sebastian Berg sebastian@sipsolutions.net wrote:
> Well, if you think alternatives is too open to actually using them, I think it would be fine to list them as "Rejected Alternatives"?
On Mon, Aug 5, 2019 at 6:18 PM Stephan Hoyer shoyer@gmail.com wrote:
> On Mon, Aug 5, 2019 at 2:48 PM Ralf Gommers ralf.gommers@gmail.com wrote:
>> The NEP currently does not say who this is meant for. Would you expect libraries like SciPy to adopt it for example?
>> The NEP also (understandably) punts on the question of when something is a valid duck array. If you want this to be widely used, that will need an answer or at least some rough guidance though. For example, we would expect a duck array to have a mean() method, but probably not a ptp() method. A library author who wants to use np.duckarray() needs to know, because she can't test with all existing and future duck array implementations.
> I think this is covered in NEP-22 already.
It's not really. We discussed this briefly in the community call today; Peter said he will try to add some text.
We should not add new functions to NumPy without indicating who is supposed to use them, and what need they fill / what problem they solve. It seems pretty clear to me that it's mostly aimed at library authors rather than end users, and also that mature libraries like SciPy may not immediately adopt it, because it's too fuzzy - so it's new libraries first, mature libraries after the dust has settled a bit (I think).
> As discussed there, I don't think NumPy is in a good position to pronounce decisive APIs at this time. I would welcome efforts to try, but I don't think that's essential for now.
There's no need to pronounce a decisive API that fully covers duck arrays. Note that RNumPy is an attempt in that direction (not a full one, but way better than nothing). In the NEP/docs, it would help to at least say something along the lines of "if you implement this, we recommend the following strategy: check if a function is present in Dask, CuPy and Sparse. If so, it's reasonable to expect any duck array to work here. If not, we suggest you indicate in your docstring what kinds of duck arrays are accepted, or what properties they need to have". That's a spec by implementation, which is less than ideal but better than saying nothing.
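The "spec by implementation" check suggested above could be automated along these lines. This is only a sketch: the helper name `check_support` and its return format are made up for the example, and the default module names (`dask.array`, `cupy`, `sparse`) are assumed import names for the libraries mentioned in the thread. Libraries that are not installed are reported separately rather than counted as lacking the function.

```python
import importlib


def check_support(func_name, module_names=("dask.array", "cupy", "sparse")):
    """Report which reference libraries expose a callable ``func_name``."""
    found, missing, unavailable = [], [], []
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # Library not installed here, so nothing can be concluded.
            unavailable.append(name)
            continue
        if callable(getattr(module, func_name, None)):
            found.append(name)
        else:
            missing.append(name)
    return {"found": found, "missing": missing, "unavailable": unavailable}


# NumPy itself is used as a stand-in so the example runs without the
# optional libraries installed.
report = check_support("mean", ("numpy",))
```

A library author could run such a check for each function they call on their inputs, and document any function that lands in `missing` as a requirement on the duck arrays they accept.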
Cheers, Ralf
On Wed, Aug 7, 2019 at 5:11 PM Ralf Gommers ralf.gommers@gmail.com wrote:
> We should not add new functions to NumPy without indicating who is supposed to use this, and what need it fills / problem it solves. It seems pretty clear to me that it's mostly aimed at library authors rather than end users. And also that mature libraries like SciPy may not immediately adopt it, because it's too fuzzy - so it's new libraries first, mature libraries after the dust has settled a bit (I think).
I totally agree - we definitely should clarify this in the docstring and elsewhere in the docs. An example in the new doc page on "Writing custom array containers" (https://numpy.org/devdocs/user/basics.dispatch.html) would also probably be appropriate.
> There's no need to pronounce a decisive API that fully covers duck array. Note that RNumPy is an attempt in that direction (not a full one, but way better than nothing). In the NEP/docs, at least saying something along the lines of "if you implement this, we recommend the following strategy: check if a function is present in Dask, CuPy and Sparse. If so, it's reasonable to expect any duck array to work here. If not, we suggest you indicate in your docstring what kinds of duck arrays are accepted, or what properties they need to have". That's a spec by implementation, which is less than ideal but better than saying nothing.
OK, I agree here as well - some guidance is better than nothing.
Two other minor notes on this NEP, concerning naming:
1. We should have a brief note on why we settled on the name "duck array". Namely, as discussed in NEP-22, we don't love the "duck" jargon, but we couldn't come up with anything better since NumPy already uses "array like" and "any array" for different purposes.
2. The protocol should use *something* more clearly namespaced as NumPy-specific than __duckarray__. All the other special protocols NumPy defines start with "__array_". That suggests either __array_duckarray__ (sounds a little redundant) or __numpy_duckarray__ (which I like the look of, but is different from the existing protocols).
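For readers who want to see the existing naming convention referred to in the second note, the attributes on `np.ndarray` that share the `__array_` prefix can be listed directly. This is just an inspection snippet; the exact list depends on the installed NumPy version, so no output is shown.

```python
import numpy as np

# Collect the special attributes on ndarray that use the "__array_" prefix.
array_protocol_names = sorted(
    name for name in dir(np.ndarray) if name.startswith("__array_")
)

for name in array_protocol_names:
    print(name)
```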
On Wed, Aug 7, 2019 at 7:10 PM Stephan Hoyer shoyer@gmail.com wrote:
> Two other minor notes on this NEP, concerning naming:
> 1. We should have a brief note on why we settled on the name "duck array". Namely, as discussed in NEP-22, we don't love the "duck" jargon, but we couldn't come up with anything better since NumPy already uses "array like" and "any array" for different purposes.
> 2. The protocol should use *something* more clearly namespaced as NumPy specific than __duckarray__. All the other special protocols NumPy defines start with "__array_". That suggests either __array_duckarray__ (sounds a little redundant) or __numpy_duckarray__ (which I like the look of, but is a different from the existing protocols).
`__numpy_like__` ?
Chuck
On Wed, Aug 7, 2019 at 6:18 PM Charles R Harris charlesr.harris@gmail.com wrote:
`__numpy_like__` ?
This could work, but I think we would also want to rename the NumPy function itself to either np.like or np.numpy_like. The latter is a little redundant but definitely more self-descriptive than "duck array".
Chuck
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion
Apologies for the late reply. I've opened a new PR https://github.com/numpy/numpy/pull/14257 with the changes requested on clarifying the text. After reading the detailed description, I've decided to add a subsection "Scope" to clarify the scope where NEP 30 would be useful. I think the inclusion of this new subsection complements the "Detailed description", forming a complete text w.r.t. the motivation of the NEP, but feel free to point out disagreements with my suggestion. I've also added a new section "Usage" pointing out how one would use duck array as a replacement for np.asarray where relevant.
Regarding the naming discussion, I must say I like the idea of keeping the __array_ prefix, but it seems like that is going to be difficult given that none of the existing ideas so far play very nicely with that. So if the general consensus is to go with __numpy_like__, I would also update the NEP to reflect that change. FWIW, I neither particularly like nor dislike __numpy_like__, but I don't have any better suggestions than that or keeping the current naming.
Best, Peter
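For readers following along, the usage described above (calling a duck array function where np.asarray would otherwise be used) might look roughly like the sketch below. It assumes the names under discussion at this point in the thread, `duckarray` and `__duckarray__`, neither of which is an existing NumPy API; `MyDuckArray` is a hypothetical class made up for illustration.

```python
import numpy as np


def duckarray(array_like):
    """Sketch of the proposed function; not an existing NumPy API."""
    # Look the method up on the type, as Python does for special methods.
    method = getattr(type(array_like), "__duckarray__", None)
    if method is not None:
        return method(array_like)
    # Objects without the protocol are coerced as usual.
    return np.asarray(array_like)


class MyDuckArray:
    """Hypothetical duck array that refuses implicit coercion."""

    def __init__(self, data):
        self.data = list(data)

    def __duckarray__(self):
        # Hand back the duck array itself instead of a NumPy array.
        return self

    def __array__(self, dtype=None, copy=None):
        # Assumption for this sketch: coercion is expensive, so disallow it.
        raise TypeError("implicit conversion to a NumPy array is not allowed")
```

With this sketch, `duckarray(MyDuckArray([1, 2, 3]))` hands back the same object, a plain list is still coerced by `np.asarray`, and `np.asarray(MyDuckArray(...))` raises the `TypeError` from `__array__`.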
Trivial note:
On the subject of naming things (spelling things??), should it be:
numpy, Numpy, or NumPy?
All three are in the draft NEP 30 (mostly "NumPy"; I noticed this when reading/copy editing the NEP). Is there an "official" capitalization?
My preference would be to use "numpy" and, where practicable, a "computer" font, i.e. ``numpy`` in RST.
But if there is consensus already for anything else, that's fine, I'd just like to know what it is.
CHB
On Mon, Aug 12, 2019 at 4:02 AM Peter Andreas Entschev peter@entschev.com wrote:
Apologies for the late reply. I've opened a new PR https://github.com/numpy/numpy/pull/14257 with the changes requested
thanks!
I've written a small PR on your PR:
https://github.com/pentschev/numpy/pull/1
Essentially, other than typos and copy editing, I'm suggesting that a duck array could implement __array__ if it so chooses; it should, of course, return an actual numpy array.
I think this could be useful, as much code does require an actual numpy array, and only that class itself knows how best to convert to one.
CHB
My answer to that: "NumPy". Reference: logo at the top of https://numpy.org/neps/index.html .
In NEP 30 [1], I've used "NumPy" everywhere, except for references to code, repos, etc., where "numpy" is used. I see there's one occurrence of "Numpy", which was definitely a typo that I had not noticed until now. I will address it in a future update; thanks for pointing that out.
[1] https://numpy.org/neps/nep-0030-duck-array-protocol.html
What would be the use case for a duck array to implement __array__ and return a NumPy array? Unless I'm missing something, this seems redundant and one should just use the array/asarray functions then. This would also prevent error handling: what if the developer intentionally wants a NumPy-like array (e.g., the original array passed to the duckarray function) or an exception (instead of coercing to a NumPy array)?
On Mon, Sep 16, 2019 at 1:45 PM Peter Andreas Entschev peter@entschev.com wrote:
What would be the use case for a duck array to implement __array__ and return a NumPy array? Unless I'm missing something, this seems redundant and one should just use the array/asarray functions then. This would also prevent error handling: what if the developer intentionally wants a NumPy-like array (e.g., the original array passed to the duckarray function) or an exception (instead of coercing to a NumPy array)?
Dask arrays are a good example. They will want to implement __duck_array__ (or whatever we call it) because they support duck typed versions of NumPy operations. They also (already) implement __array__, so they can be converted into NumPy arrays as a fallback. This is convenient for moderately sized dask arrays, e.g., so you can pass one into a matplotlib function.
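A minimal sketch of that pattern, for illustration only: `LazyArray` is a hypothetical stand-in for a lazy, Dask-like array (it is not how Dask is implemented), and `__duckarray__` is the protocol name assumed from the NEP draft. The class supports the duck-typed path while still offering explicit conversion through `__array__`.

```python
import numpy as np


class LazyArray:
    """Hypothetical stand-in for a lazy, Dask-like array."""

    def __init__(self, compute):
        # Callable that produces the underlying data on demand.
        self._compute = compute

    def __duckarray__(self):
        # Duck-typed code paths keep working with the lazy object itself.
        return self

    def __array__(self, dtype=None, copy=None):
        # Fallback: materialize the data as a real NumPy array.
        return np.asarray(self._compute(), dtype=dtype)


lazy = LazyArray(lambda: [1.0, 2.0, 3.0])

# Code that needs a real ndarray (plotting, for instance) can still coerce.
materialized = np.asarray(lazy)
```

Here `np.asarray(lazy)` triggers the computation and returns an ndarray, while a caller going through the duck array protocol would get `lazy` back unchanged.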
On Mon, Sep 16, 2019 at 1:42 PM Peter Andreas Entschev peter@entschev.com wrote:
My answer to that: "NumPy". Reference: logo at the top of https://numpy.org/neps/index.html .
Yes, NumPy is the right capitalization
got it, thanks.
I've fixed that typo in a PR I'm working on, too.
CHB
On Mon, Sep 16, 2019 at 2:41 PM Ralf Gommers ralf.gommers@gmail.com wrote:
On Mon, Sep 16, 2019 at 1:42 PM Peter Andreas Entschev peter@entschev.com wrote:
My answer to that: "NumPy". Reference: logo at the top of https://numpy.org/neps/index.html .
Yes, NumPy is the right capitalization
In NEP30 [1], I've used "NumPy" everywhere, except for references to code, repos, etc., where "numpy" is used. I see there's one occurrence of "Numpy", which was definitely a typo and I had not noticed it until now, but I will address this on a future update, thanks for pointing that out.
[1] https://numpy.org/neps/nep0030duckarrayprotocol.html
On Mon, Sep 16, 2019 at 9:09 PM Chris Barker chris.barker@noaa.gov wrote:
Trivial note:
On the subject of naming things (spelling things??)  should it be:
numpy or Numpy or NumPy ?
All three are in the draft NEP 30 ( mostly "NumPy", I noticed this when
reading/copy editing the NEP) . Is there an "official" capitalization?
My preference, would be to use "numpy", and where practicable, use a
"computer" font  i.e. ``numpy`` in RST.
But if there is consensus already for anything else, that's fine, I'd
just like to know what it is.
CHB
On Mon, Aug 12, 2019 at 4:02 AM Peter Andreas Entschev <
peter@entschev.com> wrote:
Apologies for the late reply. I've opened a new PR https://github.com/numpy/numpy/pull/14257 with the changes requested on clarifying the text. After reading the detailed description, I've decided to add a subsection "Scope" to clarify the scope where NEP30 would be useful. I think the inclusion of this new subsection complements the "Detail description" forming a complete text w.r.t. motivation of the NEP, but feel free to point out disagreements with my suggestion. I've also added a new section "Usage" pointing out how one would use duck array in replacement to np.asarray where relevant.
Regarding the naming discussion, I must say I like the idea of keeping the __array_ prefix, but it seems like that is going to be difficult given that none of the existing ideas so far play very nicely with that. So if the general consensus is to go with __numpy_like__, I would also update the NEP to reflect that changes. FWIW, I particularly neither like nor dislike __numpy_like__, but I don't have any better suggestions than that or keeping the current naming.
Best, Peter
On Thu, Aug 8, 2019 at 3:40 AM Stephan Hoyer shoyer@gmail.com wrote:
> On Wed, Aug 7, 2019 at 6:18 PM Charles R Harris charlesr.harris@gmail.com wrote:
>> On Wed, Aug 7, 2019 at 7:10 PM Stephan Hoyer shoyer@gmail.com wrote:
>>> On Wed, Aug 7, 2019 at 5:11 PM Ralf Gommers ralf.gommers@gmail.com wrote:
>>>> On Mon, Aug 5, 2019 at 6:18 PM Stephan Hoyer shoyer@gmail.com wrote:
>>>>> On Mon, Aug 5, 2019 at 2:48 PM Ralf Gommers ralf.gommers@gmail.com wrote:
>>>>>> The NEP currently does not say who this is meant for. Would you expect libraries like SciPy to adopt it for example?
>>>>>> The NEP also (understandably) punts on the question of when something is a valid duck array. If you want this to be widely used, that will need an answer or at least some rough guidance though. For example, we would expect a duck array to have a mean() method, but probably not a ptp() method. A library author who wants to use np.duckarray() needs to know, because she can't test with all existing and future duck array implementations.
>>>>> I think this is covered in NEP22 already.
>>>> It's not really. We discussed this briefly in the community call today, Peter said he will try to add some text.
>>>> We should not add new functions to NumPy without indicating who is supposed to use this, and what need it fills / problem it solves. It seems pretty clear to me that it's mostly aimed at library authors rather than end users. And also that mature libraries like SciPy may not immediately adopt it, because it's too fuzzy, so it's new libraries first, mature libraries after the dust has settled a bit (I think).
>>> I totally agree; we definitely should clarify this in the docstring and elsewhere in the docs. An example in the new doc page on "Writing custom array containers" (https://numpy.org/devdocs/user/basics.dispatch.html) would also probably be appropriate.
>>>>> As discussed there, I don't think NumPy is in a good position to pronounce decisive APIs at this time. I would welcome efforts to try, but I don't think that's essential for now.
>>>> There's no need to pronounce a decisive API that fully covers duck array. Note that RNumPy is an attempt in that direction (not a full one, but way better than nothing). In the NEP/docs, at least saying something along the lines of "if you implement this, we recommend the following strategy: check if a function is present in Dask, CuPy and Sparse. If so, it's reasonable to expect any duck array to work here. If not, we suggest you indicate in your docstring what kinds of duck arrays are accepted, or what properties they need to have". That's a spec by implementation, which is less than ideal but better than saying nothing.
>>> OK, I agree here as well; some guidance is better than nothing.
>>> Two other minor notes on this NEP, concerning naming:
>>> 1. We should have a brief note on why we settled on the name "duck array". Namely, as discussed in NEP22, we don't love the "duck" jargon, but we couldn't come up with anything better since NumPy already uses "array like" and "any array" for different purposes.
>>> 2. The protocol should use *something* more clearly namespaced as NumPy specific than __duckarray__. All the other special protocols NumPy defines start with "__array_". That suggests either __array_duckarray__ (sounds a little redundant) or __numpy_duckarray__ (which I like the look of, but is different from the existing protocols).
>> `__numpy_like__` ?
> This could work, but I think we would also want to rename the NumPy function itself to either np.like or np.numpy_like. The latter is a little redundant but definitely more self-descriptive than "duck array".
Chuck _______________________________________________ NumPyDiscussion mailing list NumPyDiscussion@python.org https://mail.python.org/mailman/listinfo/numpydiscussion

Christopher Barker, Ph.D. Oceanographer
Emergency Response Division NOAA/NOS/OR&R (206) 5266959 voice 7600 Sand Point Way NE (206) 5266329 fax Seattle, WA 98115 (206) 5266317 main reception
Chris.Barker@noaa.gov
On Mon, Sep 16, 2019 at 1:46 PM Peter Andreas Entschev peter@entschev.com wrote:
What would be the use case for a duckarray to implement __array__ and return a NumPy array?
Some users need a genuine, actual NumPy array (for passing to Cython code, for example). If __array__ is not implemented, how can they get that from an array-like object?
Only the author of the array-like object knows how best to make a NumPy array out of it.
Unless I'm missing something, this seems redundant and one should just use array/asarray functions then.
But if the object does not implement __array__, then users can't use the array/asarray functions!
This would also prevent error handling: what if the developer intentionally wants a NumPy-like array (e.g., the original array passed to the duckarray function) or an exception (instead of coercing to a NumPy array)?
I'm really confused now. If an end user wants a duck array, they should call duckarray(); if they want an actual NumPy array, they should call asarray().
Why would anyone want an Exception? If you don't want an array, then don't call asarray().
If you call duckarray(), and the object has not implemented __duckarray__, then you will get an exception, which you should.
If you call __array__, and __array__ has not been implemented, then you will get an exception.
What is the potential problem here?
Which makes me think: why should duck arrays ever implement an __array__ method that raises an Exception? Why not just not implement it? (Unless you want to add some helpful error message, which I did for the example in my PR.)
(PR to the numpy repo in progress)
CHB
On Mon, Sep 16, 2019 at 2:27 PM Stephan Hoyer shoyer@gmail.com wrote:
On Mon, Sep 16, 2019 at 1:45 PM Peter Andreas Entschev peter@entschev.com wrote:
What would be the use case for a duckarray to implement __array__ and return a NumPy array?
Dask arrays are a good example. They will want to implement __duck_array__ (or whatever we call it) because they support duck-typed versions of NumPy operations. They also (already) implement __array__, so they can be converted into NumPy arrays as a fallback. This is convenient for moderately sized dask arrays, e.g., so you can pass one into a matplotlib function.
Exactly.
And I have implemented __array__ in classes that are NOT duck arrays at all (an image class, for instance). But I also can see wanting to support both:
use me as a duck array and convert me into a proper numpy array.
OK, looking again at the NEP, I see this suggested implementation:

    def duckarray(array_like):
        if hasattr(array_like, '__duckarray__'):
            return array_like.__duckarray__()
        return np.asarray(array_like)

So I see the point now: if a user wants a duck array, they may not want to accidentally coerce this object to a real array (potentially expensive).
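For anyone following along, here is a minimal sketch of that dispatch. It assumes the NEP draft's `duckarray` function as quoted in this thread (it is not an existing NumPy API), and `LazyDuck` is a hypothetical class made up purely for illustration.

```python
import numpy as np


def duckarray(array_like):
    """Dispatch as suggested in the NEP draft (not part of NumPy itself)."""
    if hasattr(array_like, "__duckarray__"):
        return array_like.__duckarray__()
    return np.asarray(array_like)


class LazyDuck:
    """Hypothetical duck array; name and attributes are illustrative only."""

    def __init__(self, data):
        self._data = list(data)

    def __duckarray__(self):
        # Hand back the duck array itself; nothing is converted to ndarray.
        return self


duck = LazyDuck([1, 2, 3])
same = duckarray(duck)           # dispatches through __duckarray__
fallback = duckarray([1, 2, 3])  # no __duckarray__, so np.asarray is used
```

With this sketch, `same` is the original `LazyDuck` object, while the plain list falls through to `np.asarray` and becomes an ndarray.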
But in this case, asarray() will only get called (and thus __array__ will only get called) if __duckarray__ is not implemented. So the only reason to implement __array__ and raise an Exception is so that users will get that exception if they specifically call asarray(). Why should they get that?
I'm working on a PR with a suggestion for this.
CHB
OK, I *finally* got it:
When you pass an arbitrary object into np.asarray(), it will create an object array scalar with the object in it.
So yes, I can see that you may want to raise a TypeError instead, so that users don't get an object array scalar when they were expecting to get an array-like object.
So it's probably a good idea to recommend that when a class implements __duckarray__, it also implements __array__, which can either raise an exception or return an ndarray.
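The object-scalar behaviour described above is easy to check. This is only a small sketch; `Opaque` is a hypothetical class that deliberately has no __array__, no __len__ and no buffer support.

```python
import numpy as np


class Opaque:
    """Hypothetical class that NumPy knows nothing about."""


obj = Opaque()
wrapped = np.asarray(obj)

# NumPy does not fail here; it wraps the object in a zero-dimensional
# array of dtype object.
print(wrapped.shape)
print(wrapped.dtype)
print(wrapped.item() is obj)
```

The call succeeds silently, which is why a duck array that does not define __array__ at all can surprise users who expected either a real array or an error.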
CHB
Here's a PR with a different discussion of __array__:
https://github.com/numpy/numpy/pull/14529
CHB
I see what you mean now. It was my misunderstanding, I thought you wanted to return a call to __array__ when you call np.duckarray.
I agree with your point and understand how the current text may be misleading, so we shall make it clearer in the NEP (as done in https://github.com/numpy/numpy/pull/14529) that both are valid ways:
* Have a genuine implementation of __array__ (like Dask, as pointed out by Stephan); or
* Raise an exception (as CuPy does).
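A rough sketch of the two options side by side. `HostDuck` and `DeviceDuck` are hypothetical classes invented for this example; they only mimic the Dask-style and CuPy-style choices and are not taken from either library. The `copy` argument is accepted on the assumption that newer NumPy versions may pass it.

```python
import numpy as np


class HostDuck:
    """Hypothetical duck array that is cheap to convert (Dask-style choice)."""

    def __init__(self, data):
        self._data = list(data)

    def __duckarray__(self):
        return self

    def __array__(self, dtype=None, copy=None):
        # Genuine conversion; `copy` is accepted but ignored in this sketch.
        return np.array(self._data, dtype=dtype)


class DeviceDuck:
    """Hypothetical duck array that refuses coercion (CuPy-style choice)."""

    def __duckarray__(self):
        return self

    def __array__(self, dtype=None, copy=None):
        raise TypeError("implicit conversion to a NumPy array is not allowed")


converted = np.asarray(HostDuck([1, 2, 3]))  # a real ndarray

try:
    np.asarray(DeviceDuck())
    refused = False
except TypeError:
    refused = True  # the error from __array__ reaches the caller
```

Both classes still work with a duckarray()-style dispatch, since __duckarray__ is checked first; the difference only shows up when someone explicitly asks for an ndarray.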
Thanks for opening the PR, I will comment there as well.
On Tue, Sep 17, 2019 at 6:56 AM Peter Andreas Entschev peter@entschev.com wrote:
I agree with your point and understand how the current text may be misleading, so we shall make it clearer in the NEP (as done in https://github.com/numpy/numpy/pull/14529) that both are valid ways:
* Have a genuine implementation of __array__ (like Dask, as pointed out by Stephan); or
* Raise an exception (as CuPy does).
Great, sounds like we (well, three of us anyway) are all on the same page.
Just need to sort out the text.
CHB
Thanks for opening the PR, I will comment there as well.
participants (6)

Charles R Harris

Chris Barker

Peter Andreas Entschev

Ralf Gommers

Sebastian Berg

Stephan Hoyer