NEP 30 - Duck Typing for NumPy Arrays - Implementation
Hi,
we have a new proposal for the implementation of NumPy array duck typing [1] [2], following the high-level overview described in NEP-22 [3].
Would be great to get some comments on that.
[1] https://github.com/numpy/numpy/blob/master/doc/neps/nep-0030-duck-array-prot...
[2] https://github.com/numpy/numpy/pull/14170
[3] https://numpy.org/neps/nep-0022-ndarray-duck-typing-overview.html
Best, Peter
Hi Peter, thanks for writing that up!
On Mon, Aug 5, 2019 at 8:07 AM Peter Andreas Entschev <peter@entschev.com> wrote:
> Hi,
> we have a new proposal for the implementation of NumPy array duck typing [1] [2], following the high-level overview described in NEP-22 [3].
A couple of high level comments:
Having __array__ give a TypeError is fine for libraries that want to prevent unintentional coercion with, e.g., `np.asarray(my_ducktype)`. However that leaves the obvious question of what the right way is to do this intentionally. Would be good to recommend something, for example a `numpy()` or `to_numpy()` method. Also, the NEP should make it clearer that this is not the obviously correct thing to do, it only makes sense in cases where coercion is very expensive, like CuPy and Sparse. For Dask for example, coercion to a numpy array is perfectly reasonable.
The NEP currently does not say who this is meant for. Would you expect libraries like SciPy to adopt it for example?
The NEP also (understandably) punts on the question of when something is a valid duck array. If you want this to be widely used, that will need an answer or at least some rough guidance though. For example, we would expect a duck array to have a mean() method, but probably not a ptp() method. A library author who wants to use np.duckarray() needs to know, because she can't test with all existing and future duck array implementations.
An alternative to introducing np.duckarray() would be to just modify np.asarray(). Of course this has backwards compatibility impact, but if you're going to be raising a TypeError from __array__ then that impact is there anyway. Note: I don't think this is necessarily a better idea, because it may lead to less clear errors, but it's worth putting in the alternatives section at least.
Cheers, Ralf
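To make the first point above concrete, here is a minimal sketch of the pattern being discussed: a duck array whose `__array__` raises a TypeError so that implicit coercion fails loudly, paired with an explicit conversion method. The class name `ExpensiveDuckArray` and its list-backed storage are made up for this illustration, and `to_numpy()` is only one of the names suggested above, not something the NEP prescribes.

```python
class ExpensiveDuckArray:
    """Hypothetical duck array for which conversion to NumPy is costly."""

    def __init__(self, values):
        self._values = list(values)

    def __array__(self, dtype=None, copy=None):
        # Refuse implicit coercion (e.g. through np.asarray) with a clear message.
        raise TypeError(
            "implicit conversion to numpy.ndarray is disabled; "
            "call to_numpy() to convert explicitly"
        )

    def to_numpy(self):
        # Explicit, opt-in conversion. NumPy is imported lazily so that the
        # class itself can be defined without NumPy installed.
        import numpy as np

        return np.array(self._values)


duck = ExpensiveDuckArray([1, 2, 3])
try:
    duck.__array__()
except TypeError as exc:
    print("refused:", exc)
```

Whether a library wants this behaviour depends on how expensive coercion is for it; as noted above, for something like Dask a plain `__array__` that returns an ndarray may be the more reasonable choice.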
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion
On Mon, Aug 5, 2019 at 2:48 PM Ralf Gommers <ralf.gommers@gmail.com> wrote:
> Having __array__ give a TypeError is fine for libraries that want to prevent unintentional coercion with, e.g., `np.asarray(my_ducktype)`. However that leaves the obvious question of what the right way is to do this intentionally. Would be good to recommend something, for example a `numpy()` or `to_numpy()` method. Also, the NEP should make it clearer that this is not the obviously correct thing to do, it only makes sense in cases where coercion is very expensive, like CuPy and Sparse. For Dask for example, coercion to a numpy array is perfectly reasonable.
I agree, we need another solution for explicit array conversion, either from duck arrays to NumPy arrays or between duck arrays.
As has come up on GitHub [1], I think this should probably be another protocol, to allow for third-party conversions like sparse <-> dask that in principle could be implemented by either library.
To get discussion started, here's one possible proposal for what the NumPy API(s) could look like:
    np.coerce(sparse_array)  # by default, coerce to np.ndarray
    np.coerce(sparse_array, dask.array.Array)  # coerces to dask
    np.coerce_like(sparse_array, dask_array)  # coerce like the second array type
    np.coerce_arrays(list_of_arrays)  # coerce to first type that can handle everything
The protocol itself should probably either use __array_function__ (e.g., for np.coerce_like, if all the dispatched-on arguments are arrays) or a custom protocol in the same style that allows for implementations on either the array being converted or the type of the result [2].
[1] https://github.com/numpy/numpy/issues/13831
[2] https://github.com/numpy/numpy/pull/14170#issuecomment-517004293
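As a rough illustration of the "custom protocol" option, the sketch below lets either the array being converted or the type of the result implement the conversion. Everything in it is hypothetical: `np.coerce` does not exist in NumPy, so `coerce` and `coerce_like` are plain local functions, the hook names `__coerce_to__` and `__coerce_from__` are invented for this example, and the two toy container classes stand in for real libraries. Unlike the proposal above, the target type has no default here.

```python
class SparseVector:
    """Toy sparse container: a length plus an {index: value} mapping."""

    def __init__(self, size, entries):
        self.size = size
        self.entries = dict(entries)

    def __coerce_to__(self, target_type):
        # Hypothetical hook on the array being converted. This toy class
        # knows no targets, so it defers to the hook on the target type.
        return NotImplemented


class DenseVector:
    """Toy dense container backed by a Python list."""

    def __init__(self, values):
        self.values = list(values)

    @classmethod
    def __coerce_from__(cls, obj):
        # Hypothetical hook on the type of the result.
        if isinstance(obj, SparseVector):
            values = [0] * obj.size
            for index, value in obj.entries.items():
                values[index] = value
            return cls(values)
        return NotImplemented


def coerce(obj, target_type):
    """Convert obj to target_type, asking the source first, then the target."""
    if isinstance(obj, target_type):
        return obj
    to_hook = getattr(type(obj), "__coerce_to__", None)
    if to_hook is not None:
        result = to_hook(obj, target_type)
        if result is not NotImplemented:
            return result
    from_hook = getattr(target_type, "__coerce_from__", None)
    if from_hook is not None:
        result = from_hook(obj)
        if result is not NotImplemented:
            return result
    raise TypeError(
        f"cannot coerce {type(obj).__name__} to {target_type.__name__}"
    )


def coerce_like(obj, other):
    """Convert obj to the type of other."""
    return coerce(obj, type(other))


dense = coerce(SparseVector(4, {1: 5}), DenseVector)
print(dense.values)  # [0, 5, 0, 0]
```

Returning `NotImplemented` from a hook mirrors how Python's binary operators and `__array_function__` let one side decline so that the other side gets a chance.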
> The NEP currently does not say who this is meant for. Would you expect libraries like SciPy to adopt it for example?
> The NEP also (understandably) punts on the question of when something is a valid duck array. If you want this to be widely used, that will need an answer or at least some rough guidance though. For example, we would expect a duck array to have a mean() method, but probably not a ptp() method. A library author who wants to use np.duckarray() needs to know, because she can't test with all existing and future duck array implementations.
I think this is covered in NEP-22 already. As discussed there, I don't think NumPy is in a good position to pronounce decisive APIs at this time. I would welcome efforts to try, but I don't think that's essential for now.
Thanks for the concerns raised, and Stephan for promptly answering them.
> An alternative to introducing np.duckarray() would be to just modify np.asarray(). Of course this has backwards compatibility impact, but if you're going to be raising a TypeError from __array__ then that impact is there anyway. Note: I don't think this is necessarily a better idea, because it may lead to less clear errors, but it's worth putting in the alternatives section at least.
I don't know if mentioning alternatives in that way is good, it gives me the impression that NumPy is encouraging them (unless it really is).
In that sense, I think the best is indeed going down the path of finding a good coercion solution (as Stephan already mentioned) and later we could just add a pointer to the new NEP in this one.
Best, Peter
On Tue, Aug 6, 2019 at 3:17 AM Stephan Hoyer <shoyer@gmail.com> wrote:
> I agree, we need another solution for explicit array conversion, either from duck arrays to NumPy arrays or between duck arrays.
On Tue, 2019-08-06 at 10:24 +0200, Peter Andreas Entschev wrote:
> I don't know if mentioning alternatives in that way is good, it gives me the impression that NumPy is encouraging them (unless it really is).
Well, if you think alternatives is too open to actually using them, I think it would be fine to list them as "Rejected Alternatives"?
- Sebastian
Sure, I wouldn't mind doing that, but it would also be better to have a clear alternative/complement to duck array (as I'm hoping will be the case with coerce). I will try to give a bit more thought to the coercion ideas and start writing a NEP for that this week and the next. Perhaps we can then sync both NEPs.
On Tue, Aug 6, 2019 at 4:23 PM Sebastian Berg <sebastian@sipsolutions.net> wrote:
> Well, if you think alternatives is too open to actually using them, I think it would be fine to list them as "Rejected Alternatives"?
> - Sebastian
On Mon, Aug 5, 2019 at 6:18 PM Stephan Hoyer <shoyer@gmail.com> wrote:
> I think this is covered in NEP-22 already.
It's not really. We discussed this briefly in the community call today, Peter said he will try to add some text.
We should not add new functions to NumPy without indicating who is supposed to use this, and what need it fills / problem it solves. It seems pretty clear to me that it's mostly aimed at library authors rather than end users. And also that mature libraries like SciPy may not immediately adopt it, because it's too fuzzy - so it's new libraries first, mature libraries after the dust has settled a bit (I think).
> As discussed there, I don't think NumPy is in a good position to pronounce decisive APIs at this time. I would welcome efforts to try, but I don't think that's essential for now.
There's no need to pronounce a decisive API that fully covers duck array. Note that RNumPy is an attempt in that direction (not a full one, but way better than nothing). In the NEP/docs, at least saying something along the lines of "if you implement this, we recommend the following strategy: check if a function is present in Dask, CuPy and Sparse. If so, it's reasonable to expect any duck array to work here. If not, we suggest you indicate in your docstring what kinds of duck arrays are accepted, or what properties they need to have". That's a spec by implementation, which is less than ideal but better than saying nothing.
Cheers, Ralf
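The "check if a function is present" strategy could be automated along these lines. This is only a sketch: the helper name `libraries_missing` is invented here, and the demo calls use the standard-library modules `math` and `cmath` purely as stand-ins so that the example runs without Dask, CuPy or Sparse installed.

```python
import importlib


def libraries_missing(func_name, module_names):
    """Return the modules in module_names that do not expose a callable func_name.

    Raises ImportError if one of the modules is not installed.
    """
    missing = []
    for module_name in module_names:
        module = importlib.import_module(module_name)
        if not callable(getattr(module, func_name, None)):
            missing.append(module_name)
    return missing


# Stand-in demo with stdlib modules: sqrt exists in both, floor only in math.
print(libraries_missing("sqrt", ["math", "cmath"]))   # []
print(libraries_missing("floor", ["math", "cmath"]))  # ['cmath']
```

With the real libraries installed, a call such as `libraries_missing("mean", ["dask.array", "cupy", "sparse"])` would apply the same check to the three implementations named above. An empty result suggests the function is reasonable to expect from any duck array; otherwise the docstring should say which duck arrays are accepted.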
On Wed, Aug 7, 2019 at 5:11 PM Ralf Gommers <ralf.gommers@gmail.com> wrote:
> It's not really. We discussed this briefly in the community call today, Peter said he will try to add some text.
> We should not add new functions to NumPy without indicating who is supposed to use this, and what need it fills / problem it solves. It seems pretty clear to me that it's mostly aimed at library authors rather than end users. And also that mature libraries like SciPy may not immediately adopt it, because it's too fuzzy - so it's new libraries first, mature libraries after the dust has settled a bit (I think).
I totally agree -- we definitely should clarify this in the docstring and elsewhere in the docs. An example in the new doc page on "Writing custom array containers" (https://numpy.org/devdocs/user/basics.dispatch.html) would also probably be appropriate.
> There's no need to pronounce a decisive API that fully covers duck array. Note that RNumPy is an attempt in that direction (not a full one, but way better than nothing). In the NEP/docs, at least saying something along the lines of "if you implement this, we recommend the following strategy: check if a function is present in Dask, CuPy and Sparse. If so, it's reasonable to expect any duck array to work here. If not, we suggest you indicate in your docstring what kinds of duck arrays are accepted, or what properties they need to have". That's a spec by implementation, which is less than ideal but better than saying nothing.
OK, I agree here as well -- some guidance is better than nothing.
Two other minor notes on this NEP, concerning naming:
1. We should have a brief note on why we settled on the name "duck array". Namely, as discussed in NEP-22, we don't love the "duck" jargon, but we couldn't come up with anything better since NumPy already uses "array like" and "any array" for different purposes.
2. The protocol should use *something* more clearly namespaced as NumPy specific than __duckarray__. All the other special protocols NumPy defines start with "__array_". That suggests either __array_duckarray__ (sounds a little redundant) or __numpy_duckarray__ (which I like the look of, but is different from the existing protocols).
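For readers following the naming question, here is a minimal sketch of what the function and the protocol method do together, based on my reading of the draft NEP: the function returns whatever the object's protocol method returns, and otherwise falls back to `np.asarray`. The function below is a local stand-in, not a NumPy API (no `np.duckarray` is assumed to exist), and `MyDuckArray` and `total` are made-up names for the example.

```python
def duckarray(array_like):
    """Local stand-in for the proposed np.duckarray (an assumption, not a NumPy API)."""
    if hasattr(array_like, "__duckarray__"):
        return array_like.__duckarray__()
    # Fallback for everything else; NumPy is only needed on this path.
    import numpy as np

    return np.asarray(array_like)


class MyDuckArray:
    """Hypothetical duck array that opts in to the protocol."""

    def __init__(self, values):
        self.values = list(values)

    def __duckarray__(self):
        # The object is already array-like enough, so return it unchanged.
        return self

    def sum(self):
        return sum(self.values)


def total(x):
    """Library-style function that accepts duck arrays and plain array-likes."""
    x = duckarray(x)
    return x.sum()


print(total(MyDuckArray([1, 2, 3])))  # 6
```

Whichever spelling is chosen for the method, only the attribute name looked up in `duckarray` changes; the dispatch logic stays the same.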
On Wed, Aug 7, 2019 at 7:10 PM Stephan Hoyer <shoyer@gmail.com> wrote:
> 2. The protocol should use *something* more clearly namespaced as NumPy specific than __duckarray__. All the other special protocols NumPy defines start with "__array_". That suggests either __array_duckarray__ (sounds a little redundant) or __numpy_duckarray__ (which I like the look of, but is different from the existing protocols).
`__numpy_like__` ?
Chuck
On Wed, Aug 7, 2019 at 6:18 PM Charles R Harris <charlesr.harris@gmail.com> wrote:
On Wed, Aug 7, 2019 at 7:10 PM Stephan Hoyer <shoyer@gmail.com> wrote:
On Wed, Aug 7, 2019 at 5:11 PM Ralf Gommers <ralf.gommers@gmail.com> wrote:
On Mon, Aug 5, 2019 at 6:18 PM Stephan Hoyer <shoyer@gmail.com> wrote:
On Mon, Aug 5, 2019 at 2:48 PM Ralf Gommers <ralf.gommers@gmail.com> wrote:
The NEP currently does not say who this is meant for. Would you expect libraries like SciPy to adopt it for example?
The NEP also (understandably) punts on the question of when something is a valid duck array. If you want this to be widely used, that will need an answer or at least some rough guidance though. For example, we would expect a duck array to have a mean() method, but probably not a ptp() method. A library author who wants to use np.duckarray() needs to know, because she can't test with all existing and future duck array implementations.
I think this is covered in NEP-22 already.
It's not really. We discussed this briefly in the community call today, Peter said he will try to add some text.
We should not add new functions to NumPy without indicating who is supposed to use this, and what need it fills / problem it solves. It seems pretty clear to me that it's mostly aimed at library authors rather than end users. And also that mature libraries like SciPy may not immediately adopt it, because it's too fuzzy - so it's new libraries first, mature libraries after the dust has settled a bit (I think).
I totally agree -- we definitely should clarify this in the docstring and elsewhere in the docs. An example in the new doc page on "Writing custom array containers" (https://numpy.org/devdocs/user/basics.dispatch.html) would also probably be appropriate.
As discussed there, I don't think NumPy is in a good position to
pronounce decisive APIs at this time. I would welcome efforts to try, but I don't think that's essential for now.
There's no need to pronounce a decisive API that fully covers duck array. Note that RNumPy is an attempt in that direction (not a full one, but way better than nothing). In the NEP/docs, at least saying something along the lines of "if you implement this, we recommend the following strategy: check if a function is present in Dask, CuPy and Sparse. If so, it's reasonable to expect any duck array to work here. If not, we suggest you indicate in your docstring what kinds of duck arrays are accepted, or what properties they need to have". That's a spec by implementation, which is less than ideal but better than saying nothing.
OK, I agree here as well -- some guidance is better than nothing.
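The "spec by implementation" strategy quoted above could be sketched roughly as follows. This is only an illustration: `libraries_missing` is a hypothetical helper, and the `lib_a`/`lib_b`/`lib_c` namespaces are made-up stand-ins so the snippet runs without any array library installed. In real use one would presumably pass the imported Dask, CuPy and Sparse modules instead, and nothing here claims which functions those libraries actually provide.

```python
from types import SimpleNamespace


def libraries_missing(func_name, namespaces):
    """Return the labels of the namespaces that lack a callable `func_name`."""
    return [
        label
        for label, ns in namespaces.items()
        if not callable(getattr(ns, func_name, None))
    ]


# Hypothetical stand-ins for array libraries (assumption: real code would
# map labels to imported modules rather than SimpleNamespace objects).
namespaces = {
    "lib_a": SimpleNamespace(mean=lambda x: None, ptp=lambda x: None),
    "lib_b": SimpleNamespace(mean=lambda x: None),
    "lib_c": SimpleNamespace(mean=lambda x: None),
}

# An empty list suggests any duck array can be expected to work;
# a non-empty list suggests documenting the restriction in the docstring.
print(libraries_missing("mean", namespaces))
print(libraries_missing("ptp", namespaces))
```

A library author could run such a check once per function they rely on and record the result in the docstring, which is the fallback the quoted recommendation describes.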
Two other minor notes on this NEP, concerning naming:
1. We should have a brief note on why we settled on the name "duck array". Namely, as discussed in NEP-22, we don't love the "duck" jargon, but we couldn't come up with anything better since NumPy already uses "array like" and "any array" for different purposes.
2. The protocol should use *something* more clearly namespaced as NumPy-specific than __duckarray__. All the other special protocols NumPy defines start with "__array_". That suggests either __array_duckarray__ (sounds a little redundant) or __numpy_duckarray__ (which I like the look of, but is different from the existing protocols).
`__numpy_like__` ?
This could work, but I think we would also want to rename the NumPy function itself to either np.like or np.numpy_like. The latter is a little redundant but definitely more self-descriptive than "duck array".
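Whatever names are eventually chosen, the mechanism being named is a dispatch on a special method. A minimal sketch, under stated assumptions: the `duckarray` function below is a plain-Python stand-in for the proposed np.duckarray, `MyDuckArray` is a hypothetical class, and `fallback=list` replaces the np.asarray coercion only so that the snippet runs without NumPy.

```python
def duckarray(obj, fallback=list):
    """Return obj's own duck array if it opts in, else coerce with `fallback`.

    Assumption: `fallback` plays the role np.asarray would play in NumPy.
    """
    method = getattr(type(obj), "__duckarray__", None)
    if method is not None:
        # The object opted in to the protocol: let it decide what to return.
        return method(obj)
    return fallback(obj)


class MyDuckArray:
    """Hypothetical array-like container that opts in to the protocol."""

    def __init__(self, data):
        self.data = list(data)

    def __duckarray__(self):
        # Hand back the object itself, without coercing it.
        return self


duck = MyDuckArray([1, 2, 3])
print(duckarray(duck) is duck)   # the duck array passes through untouched
print(duckarray((1, 2, 3)))      # anything else goes through the fallback
```

Renaming the protocol to `__numpy_like__` would only change the attribute looked up in `duckarray`; the dispatch itself stays the same.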
Chuck _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@python.org https://mail.python.org/mailman/listinfo/numpy-discussion
Apologies for the late reply. I've opened a new PR, https://github.com/numpy/numpy/pull/14257, with the requested changes clarifying the text. After reading the detailed description, I decided to add a "Scope" subsection to clarify where NEP-30 would be useful. I think this new subsection complements the "Detail description", forming a complete text w.r.t. the motivation of the NEP, but feel free to point out disagreements with my suggestion. I've also added a new "Usage" section pointing out how one would use duck array as a replacement for np.asarray where relevant.
Regarding the naming discussion, I must say I like the idea of keeping the __array_ prefix, but that seems difficult given that none of the ideas so far play very nicely with it. So if the general consensus is to go with __numpy_like__, I will update the NEP to reflect that change. FWIW, I neither particularly like nor dislike __numpy_like__, but I don't have any better suggestions than that or keeping the current naming.
Best, Peter
On Thu, Aug 8, 2019 at 3:40 AM Stephan Hoyer <shoyer@gmail.com> wrote:
On Wed, Aug 7, 2019 at 6:18 PM Charles R Harris <charlesr.harris@gmail.com> wrote:
On Wed, Aug 7, 2019 at 7:10 PM Stephan Hoyer <shoyer@gmail.com> wrote:
On Wed, Aug 7, 2019 at 5:11 PM Ralf Gommers <ralf.gommers@gmail.com> wrote:
On Mon, Aug 5, 2019 at 6:18 PM Stephan Hoyer <shoyer@gmail.com> wrote:
On Mon, Aug 5, 2019 at 2:48 PM Ralf Gommers <ralf.gommers@gmail.com> wrote:
Trivial note:
On the subject of naming things (spelling things??) -- should it be: numpy or Numpy or NumPy?
All three are in the draft NEP 30 (mostly "NumPy"; I noticed this when reading/copy editing the NEP). Is there an "official" capitalization?
My preference would be to use "numpy" and, where practicable, a "computer" font -- i.e. ``numpy`` in RST.
But if there is consensus already for anything else, that's fine, I'd just like to know what it is.
-CHB
-- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov
My answer to that: "NumPy". Reference: the logo at the top of https://numpy.org/neps/index.html
In NEP-30 [1], I've used "NumPy" everywhere, except for references to code, repos, etc., where "numpy" is used. I see there's one occurrence of "Numpy", which was definitely a typo that I had not noticed until now; I will address it in a future update. Thanks for pointing that out.
[1] https://numpy.org/neps/nep-0030-duck-array-protocol.html
On Mon, Sep 16, 2019 at 9:09 PM Chris Barker <chris.barker@noaa.gov> wrote:
Trivial note:
On the subject of naming things (spelling things??) -- should it be:
numpy or Numpy or NumPy ?
All three are in the draft NEP 30 (mostly "NumPy"; I noticed this when reading/copy editing the NEP). Is there an "official" capitalization?
My preference would be to use "numpy" and, where practicable, a "computer" font -- i.e. ``numpy`` in RST.
But if there is consensus already for anything else, that's fine, I'd just like to know what it is.
-CHB
On Mon, Sep 16, 2019 at 1:42 PM Peter Andreas Entschev <peter@entschev.com> wrote:
My answer to that: "NumPy". Reference: logo at the top of https://numpy.org/neps/index.html .
Yes, NumPy is the right capitalization
In NEP-30 [1], I've used "NumPy" everywhere, except for references to code, repos, etc., where "numpy" is used. I see there's one occurrence of "Numpy", which was definitely a typo and I had not noticed it until now, but I will address this on a future update, thanks for pointing that out.
[1] https://numpy.org/neps/nep-0030-duck-array-protocol.html
Got it, thanks. I've fixed that typo in a PR I'm working on, too.
-CHB
On Mon, Sep 16, 2019 at 2:41 PM Ralf Gommers <ralf.gommers@gmail.com> wrote:
On Mon, Sep 16, 2019 at 1:42 PM Peter Andreas Entschev <peter@entschev.com> wrote:
My answer to that: "NumPy". Reference: logo at the top of https://numpy.org/neps/index.html .
Yes, NumPy is the right capitalization
On Mon, Aug 12, 2019 at 4:02 AM Peter Andreas Entschev <peter@entschev.com> wrote:
Apologies for the late reply. I've opened a new PR https://github.com/numpy/numpy/pull/14257 with the changes requested
Thanks! I've written a small PR on your PR: https://github.com/pentschev/numpy/pull/1 Essentially, other than typos and copy editing, I'm suggesting that a duck array could choose to implement __array__ -- it should, of course, return an actual numpy array. I think this could be useful, as much code does require an actual numpy array, and only that class itself knows how best to convert to one. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov
What would be the use case for a duck-array to implement __array__ and return a NumPy array? Unless I'm missing something, this seems redundant and one should just use array/asarray functions then. This would also prevent error handling: what if the developer intentionally wants a NumPy-like array (e.g., the original array passed to the duckarray function) or an exception (instead of coercing to a NumPy array)?
On Mon, Sep 16, 2019 at 1:45 PM Peter Andreas Entschev <peter@entschev.com> wrote:
What would be the use case for a duck-array to implement __array__ and return a NumPy array? Unless I'm missing something, this seems redundant and one should just use array/asarray functions then. This would also prevent error handling: what if the developer intentionally wants a NumPy-like array (e.g., the original array passed to the duckarray function) or an exception (instead of coercing to a NumPy array)?
Dask arrays are a good example. They will want to implement __duck_array__ (or whatever we call it) because they support duck typed versions of NumPy operations. They also (already) implement __array__, so they can be converted into NumPy arrays as a fallback. This is convenient for moderately sized dask arrays, e.g., so you can pass one into a matplotlib function.
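To make the "support both" case concrete, here is a minimal sketch of a class that offers the duck-typed path and a real conversion to ndarray as a fallback. The class name ChunkedArray and its chunk storage are made up for illustration and are not Dask's actual implementation; __duckarray__ is the protocol name proposed in the NEP, not something NumPy itself calls.

```python
import numpy as np


class ChunkedArray:
    """Hypothetical duck array supporting both protocols (not Dask's real code)."""

    def __init__(self, chunks):
        # Keep the data as a list of 1-D NumPy chunks.
        self._chunks = [np.asarray(c) for c in chunks]

    def __duckarray__(self):
        # Duck-typed path proposed in the NEP: return the object itself, no coercion.
        return self

    def __array__(self, dtype=None, copy=None):
        # Explicit fallback: materialize all chunks as a single ndarray.
        out = np.concatenate(self._chunks)
        return out if dtype is None else out.astype(dtype)


duck = ChunkedArray([[1, 2], [3, 4]])
same = duck.__duckarray__()   # stays a ChunkedArray
dense = np.asarray(duck)      # becomes a real ndarray via __array__
```

With this layout, code that needs a genuine ndarray (plotting, compiled extensions) can still call np.asarray(), while duck-aware code never triggers the conversion.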
On Mon, Sep 16, 2019 at 2:27 PM Stephan Hoyer <shoyer@gmail.com> wrote:
On Mon, Sep 16, 2019 at 1:45 PM Peter Andreas Entschev <peter@entschev.com> wrote:
What would be the use case for a duck-array to implement __array__ and return a NumPy array?
Dask arrays are a good example. They will want to implement __duck_array__ (or whatever we call it) because they support duck typed versions of NumPy operations. They also (already) implement __array__, so they can be converted into NumPy arrays as a fallback. This is convenient for moderately sized dask arrays, e.g., so you can pass one into a matplotlib function.
Exactly. And I have implemented __array__ in classes that are NOT duck arrays at all (an image class, for instance). But I can also see wanting to support both: use me as a duck array and convert me into a proper numpy array.

OK -- looking again at the NEP, I see this suggested implementation:

    def duckarray(array_like):
        if hasattr(array_like, '__duckarray__'):
            return array_like.__duckarray__()
        return np.asarray(array_like)

So I see the point now: if a user wants a duck array, they may not want to accidentally coerce this object to a real array (potentially expensive). But in this case, asarray() will only get called (and thus __array__ will only get called) if __duckarray__ is not implemented. So the only reason to implement __array__ and raise an Exception is so that users will get that exception if they specifically call asarray() -- why should they get that??

I'm working on a PR with a suggestion for this.

-CHB
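The dispatch order described above can be tried out with a small runnable sketch. The duckarray function below is a local copy of the implementation suggested in the NEP draft (it is defined here, not imported from NumPy), and MyDuck is a hypothetical class used only for illustration.

```python
import numpy as np


def duckarray(array_like):
    # Prefer the duck-typed protocol when the object provides it ...
    if hasattr(array_like, "__duckarray__"):
        return array_like.__duckarray__()
    # ... and fall back to ordinary coercion for everything else.
    return np.asarray(array_like)


class MyDuck:
    """Hypothetical duck array implementing only __duckarray__."""

    def __duckarray__(self):
        return self


duck = MyDuck()
from_duck = duckarray(duck)        # returned unchanged, __array__ is never consulted
from_list = duckarray([1, 2, 3])   # no protocol, so np.asarray() coerces it
```

Note that __array__ only comes into play on the second branch, which is why the two protocols do not conflict inside duckarray() itself.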
OK -- I *finally* got it: when you pass an arbitrary object into np.asarray(), it will create an object array scalar with the object in it. So yes, I can see that you may want to raise a TypeError instead, so that users don't get an object array scalar when they were expecting to get an array-like object. So it's probably a good idea to recommend that when a class implements __duckarray__ it also implements __array__, which can either raise an exception or return an ndarray. -CHB
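The pitfall described here is easy to reproduce. The sketch below uses two hypothetical classes, Plain and Guarded, invented for illustration: one has no array protocols at all, the other raises TypeError from __array__ in the style the thread attributes to CuPy.

```python
import numpy as np


class Plain:
    """Hypothetical class with no array protocols at all."""


class Guarded:
    """Hypothetical duck array that refuses implicit coercion."""

    def __duckarray__(self):
        return self

    def __array__(self, dtype=None, copy=None):
        raise TypeError("implicit conversion to a NumPy array is not allowed")


# Without __array__, NumPy silently wraps the object in a 0-d object array.
wrapped = np.asarray(Plain())

# With an __array__ that raises, the caller gets an explicit error instead.
try:
    np.asarray(Guarded())
    raised = False
except TypeError:
    raised = True
```

So leaving __array__ out does not produce an exception on np.asarray(); it produces an object array that is likely to fail later in a less obvious way.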
Here's a PR with a different discussion of __array__: https://github.com/numpy/numpy/pull/14529 -CHB
I see what you mean now. It was my misunderstanding, I thought you wanted to return a call to __array__ when you call np.duckarray. I agree with your point and understand how the current text may be misleading, so we shall make it clearer in the NEP (as done in https://github.com/numpy/numpy/pull/14529) that both are valid ways:
* Have a genuine implementation of __array__ (like Dask, as pointed out by Stephan); or
* Raise an exception (as CuPy does).
Thanks for opening the PR, I will comment there as well.
On Tue, Sep 17, 2019 at 6:56 AM Peter Andreas Entschev <peter@entschev.com> wrote:
I agree with your point and understand how the current text may be misleading, so we shall make it clearer in the NEP (as done in https://github.com/numpy/numpy/pull/14529) that both are valid ways:
* Have a genuine implementation of __array__ (like Dask, as pointed out by Stephan); or * Raise an exception (as CuPy does).
Great -- sounds like we're all (well, three of us anyway) on the same page. Just need to sort out the text. -CHB
Thanks for opening the PR, I will comment there as well.
On Mon, Sep 16, 2019 at 1:46 PM Peter Andreas Entschev <peter@entschev.com> wrote:
What would be the use case for a duck-array to implement __array__ and return a NumPy array?
Some users need a genuine, actual numpy array (for passing to Cython code, for example). If __array__ is not implemented, how can they get that from an array-like object?? Only the author of the array-like object knows how best to make a numpy array out of it.
Unless I'm missing something, this seems redundant and one should just use array/asarray functions then.
But if the object does not implement __array__, then users can't use the array/asarray functions!
This would also prevent error handling: what if the developer intentionally wants a NumPy-like array (e.g., the original array passed to the duckarray function) or an exception (instead of coercing to a NumPy array)?
I'm really confused now -- if an end user wants a duck array, they should call duckarray() -- if they want an actual numpy array, they should call asarray(). Why would anyone want an Exception? If you don't want an array, then don't call asarray(). If you call duckarray(), and the object has not implemented __duckarray__, then you will get an exception -- which you should. If you call __array__, and __array__ has not been implemented, then you will get an exception. What is the potential problem here? Which makes me think -- why should duck arrays ever implement an __array__ method that raises an Exception? Why not just not implement it? (Unless you want to add some helpful error message -- which I did for the example in my PR.) (PR to the numpy repo in progress) -CHB
participants (6)
- Charles R Harris
- Chris Barker
- Peter Andreas Entschev
- Ralf Gommers
- Sebastian Berg
- Stephan Hoyer