[Numpy-discussion] Proposal: add `force=` or `copy=` kwarg to `__array__` interface

Eric Wieser wieser.eric+numpy at gmail.com
Fri Apr 24 06:34:28 EDT 2020


Perhaps worth mentioning that we've discussed this sort of API before, in
https://github.com/numpy/numpy/pull/11897.

Under that proposal, the API would be something like:

* `copy=True` - always copy, like it is today
* `copy=False` - copy if needed, like it is today
* `copy=np.never_copy` - never copy, throw an exception if not possible

I think the discussion stalled on the precise spelling of the third option.
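
For illustration only, the three modes listed above could be sketched on the
consumer side roughly as follows. The names `to_array` and `NEVER_COPY` are
hypothetical stand-ins (no spelling was agreed on in that PR), and the
never-copy branch is deliberately conservative: it only accepts objects that
are already ndarrays.

```python
import numpy as np

# Hypothetical sentinel standing in for the undecided "never copy" spelling.
NEVER_COPY = object()


def to_array(obj, copy=False):
    """Convert obj to an ndarray under one of three copy policies (sketch)."""
    if copy is True:
        # Always copy.
        return np.array(obj, copy=True)
    if copy is False:
        # Copy only if needed: an existing ndarray is returned as-is.
        return np.asarray(obj)
    if copy is NEVER_COPY:
        # Never copy: this sketch only knows that ndarrays are zero-copy
        # and refuses everything else (assumption made for simplicity).
        if isinstance(obj, np.ndarray):
            return np.asarray(obj)
        raise ValueError("a copy would be required to convert this object")
    raise TypeError("copy must be True, False or NEVER_COPY")


a = np.arange(3)
same = to_array(a)                  # no copy, same object
fresh = to_array(a, copy=True)      # independent buffer
try:
    to_array([1, 2, 3], copy=NEVER_COPY)
except ValueError as exc:
    print("refused:", exc)
```

A sentinel object rather than a third boolean-like value keeps the existing
meanings of `True` and `False` untouched, which is the property the list above
relies on.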

`__array__` was not discussed there, but it seems like adding the `copy`
argument to `__array__` would be a perfectly reasonable extension.
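
As a rough sketch of what that extension might look like from the producer
side, here is a toy duck array whose `__array__` accepts a `copy` argument.
The class `DuckArray`, the sentinel `NEVER_COPY`, and the default
`copy=False` are all assumptions made for this example; nothing here is an
agreed-upon signature.

```python
import numpy as np

# Hypothetical sentinel standing in for the undecided "never copy" spelling.
NEVER_COPY = object()


class DuckArray:
    """Toy container: zero-copy only when backed by an ndarray of the
    requested dtype; anything else has to be converted (i.e. copied)."""

    def __init__(self, data):
        self._data = data

    def __array__(self, dtype=None, copy=False):
        zero_copy_ok = isinstance(self._data, np.ndarray) and (
            dtype is None or np.dtype(dtype) == self._data.dtype
        )
        if copy is NEVER_COPY:
            if not zero_copy_ok:
                raise ValueError("cannot expose this DuckArray without a copy")
            return self._data
        if copy is True or not zero_copy_ok:
            # Either a copy was requested or one is unavoidable.
            return np.array(self._data, dtype=dtype, copy=True)
        # copy=False and no conversion needed: hand out the buffer directly.
        return self._data


backing = np.arange(4.0)
duck = DuckArray(backing)
view = duck.__array__()                          # shares memory with backing
independent = duck.__array__(copy=True)          # fresh buffer
try:
    duck.__array__(dtype=np.int32, copy=NEVER_COPY)
except ValueError as exc:
    print("refused:", exc)
```

The method is called directly here so that the example does not depend on how
any particular NumPy version forwards keyword arguments to `__array__`.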

Eric

On Fri, 24 Apr 2020 at 03:00, Juan Nunez-Iglesias <jni at fastmail.com> wrote:

> Hi everyone,
>
>> One bit of expressivity we would miss is “copy if necessary, but otherwise
>> don’t bother”, but there are workarounds to this.
>>
>
> After a side discussion with Stéfan van der Walt, we came up with
> `allow_copy=True`, which would express to the downstream library that we
> don’t mind waiting, but that zero-copy would also be ok.
>
> This sounds like the sort of thing that is use case driven. If enough
> projects want to use it, then I have no objections to adding the keyword.
> OTOH, we need to be careful about adding too many interoperability tricks
> as they complicate the code and make it hard for folks to determine the
> best solution. Interoperability is a hot topic and we need to be careful
> not to leave behind too many experiments in the NumPy code. Do you
> have any other ideas of how to achieve the same effect?
>
>
> Personally, I don’t have any other ideas, but would be happy to hear some!
>
> My view regarding API/experiment creep is that `__array__` is the oldest
> and most basic of all the interop tricks and that this can be safely
> maintained for future generations. Currently it only takes `dtype=` as a
> keyword argument, so it is a very lean API. I think this particular use
> case is very natural and I’ve encountered the reluctance to implicitly copy
> twice, so I expect it is reasonably common.
>
> Regarding difficulty in determining the best solution, I would be happy to
> contribute to the dispatch basics guide together with the new kwarg. I
> agree that the protocols are getting quite numerous and I couldn’t find a
> single place that gathers all the best practices together. But, to
> reiterate my point: `__array__` is the simplest of these and I think this
> keyword is pretty safe to add.
>
> For ease of discussion, here are the API options discussed so far, as well
> as a few extra that I don’t like but might trigger other ideas:
>
> # default is False, or None -> leave it to the duck array to decide
> np.asarray(my_duck_array, allow_copy=True)
>
> # always copies, but, if supported by the duck array, defers to it for
> # the copy
> np.asarray(my_duck_array, copy=True)
>
> # could take values 'allow', 'force', 'no', True (= 'force'),
> # False (= 'no')
> np.asarray(my_duck_array, copy='allow')
>
> # separate concepts, but unclear what force_copy=True, allow_copy=False
> # means!
> np.asarray(my_duck_array, force_copy=False, allow_copy=True)
>
> np.asarray(my_duck_array, force=True)
>
> Juan.
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>