
On Fri, Apr 24, 2020 at 6:31 AM Sebastian Berg <sebastian@sipsolutions.net> wrote:
One thing to note is that `__array__` is actually asked to return a copy AFAIK.
The documentation on __array__ seems to be quite limited, unfortunately. The most I can find are a few sentences here: https://numpy.org/doc/stable/reference/arrays.classes.html#numpy.class.__arr... I don't see anything about returning copies. My interpretation has always been that __array__ can return either a copy or a view, like the np.asarray() constructor.
I doubt it always does, but if it does not I assume the object should and could provide `__array_interface__`.
Objects like xarray.DataArray and pandas.Series sometimes directly wrap NumPy arrays and sometimes don't. They both implement __array__ but not __array_interface__. It's very obvious how to implement a "forwarding" __array__ method (just call `np.asarray()` on an argument that might implement it). I guess something similar could be done for __array_interface__, but it's not clear to me that it's right to implement __array_interface__ when doing so might require a copy.
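For concreteness, a minimal sketch of such a "forwarding" __array__ method. The `Wrapper` class is hypothetical and not taken from xarray or pandas; the `copy` parameter is accepted only so that newer NumPy releases can pass it, and is ignored here.

```python
import numpy as np


class Wrapper:
    """Hypothetical container that may or may not hold a NumPy array."""

    def __init__(self, data):
        self._data = data

    def __array__(self, dtype=None, copy=None):
        # Forward to whatever is wrapped. If it is already an ndarray this
        # returns it without copying; otherwise a new array is built.
        # `copy` is ignored in this sketch (assumption: default semantics).
        return np.asarray(self._data, dtype=dtype)


base = np.arange(4.0)
view = np.asarray(Wrapper(base))            # wraps an ndarray
nested = np.asarray(Wrapper(Wrapper(base)))  # wraps another wrapper
fresh = np.asarray(Wrapper([1, 2, 3]))       # wraps a plain list

print(np.shares_memory(view, base))
print(np.shares_memory(nested, base))
print(np.shares_memory(fresh, base))
```

Whether the caller gets a view or a copy depends entirely on what is wrapped, which is the ambiguity being discussed.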
Under that assumption, it would be an opt-out right now since NumPy allows copies by default here. Defining things along copy does seem sensible, though I do not know how it would play with some of the current array-likes choosing to refuse `__array__`.
- Sebastian
Eric
On Fri, 24 Apr 2020 at 03:00, Juan Nunez-Iglesias <jni@fastmail.com> wrote:
Hi everyone,
One bit of expressivity we would miss is “copy if necessary, but otherwise
don’t bother”, but there are workarounds to this.
After a side discussion with Stéfan van der Walt, we came up with `allow_copy=True`, which would express to the downstream library that we don’t mind waiting, but that zero-copy would also be ok.
This sounds like the sort of thing that is use case driven. If enough projects want to use it, then I have no objections to adding the keyword. OTOH, we need to be careful about adding too many interoperability tricks as they complicate the code and make it hard for folks to determine the best solution. Interoperability is a hot topic and we need to be careful not to leave behind too many experiments in the NumPy code. Do you have any other ideas of how to achieve the same effect?
Personally, I don’t have any other ideas, but would be happy to hear some!
My view regarding API/experiment creep is that `__array__` is the oldest and most basic of all the interop tricks and that this can be safely maintained for future generations. Currently it only takes `dtype=` as a keyword argument, so it is a very lean API. I think this particular use case is very natural and I’ve encountered the reluctance to implicitly copy twice, so I expect it is reasonably common.
Regarding difficulty in determining the best solution, I would be happy to contribute to the dispatch basics guide together with the new kwarg. I agree that the protocols are getting quite numerous and I couldn’t find a single place that gathers all the best practices together. But, to reiterate my point: `__array__` is the simplest of these and I think this keyword is pretty safe to add.
For ease of discussion, here are the API options discussed so far, as well as a few extra that I don’t like but might trigger other ideas:
np.asarray(my_duck_array, allow_copy=True)  # default is False, or None -> leave it to the duck array to decide
np.asarray(my_duck_array, copy=True)  # always copies, but, if supported by the duck array, defers to it for the copy
np.asarray(my_duck_array, copy='allow')  # could take values 'allow', 'force', 'no', True (= 'force'), False (= 'no')
np.asarray(my_duck_array, force_copy=False, allow_copy=True)  # separate concepts, but unclear what force_copy=True, allow_copy=False means!
np.asarray(my_duck_array, force=True)
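To make the three-valued `copy=` option easier to compare with the others, here is a rough sketch of its semantics as a standalone helper. `as_array_with_policy` is a hypothetical name, not a NumPy function, and the "no copy" branch uses a deliberately crude assumption: only an existing ndarray counts as convertible without copying.

```python
import numpy as np


def as_array_with_policy(obj, copy="allow"):
    """Hypothetical helper illustrating copy='allow' / 'force' / 'no'."""
    # Map the boolean spellings onto the string ones, as in the option above.
    if copy is True:
        copy = "force"
    elif copy is False:
        copy = "no"
    if copy not in ("allow", "force", "no"):
        raise ValueError(f"unknown copy policy: {copy!r}")

    if copy == "force":
        # Always hand back freshly allocated data.
        return np.array(obj, copy=True)
    if copy == "no":
        # Simplifying assumption: anything that is not already an ndarray
        # would need a copy, so refuse rather than copy silently.
        if not isinstance(obj, np.ndarray):
            raise ValueError("conversion would require a copy")
        return obj
    # 'allow': copy only if the conversion needs one.
    return np.asarray(obj)


a = np.arange(3)
print(as_array_with_policy(a, copy="no") is a)
print(np.shares_memory(as_array_with_policy(a, copy=True), a))
```

A real implementation would have to pass the policy down to the duck array's own `__array__`, which is the part that needs the new keyword.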
Juan.
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion