It is pretty weird that these two statements don't necessarily produce the same result:

someufunc(*inputs, out=out_arr)
out_arr[...] = someufunc(*inputs)
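
A minimal sketch of that comparison, reusing the uint16 array from the quoted example below. The variable names are illustrative only. Which loop the `out=` form selects is exactly what this thread is about and may depend on the NumPy version, so that result is printed rather than assumed:

```python
import numpy as np

# uint16 values whose pairwise sum no longer fits into uint16.
arr = (np.arange(10) + 2**15).astype(np.uint16)

# Form 1: let the ufunc write into `out` directly.
# The loop chosen here is the behaviour under discussion.
out_direct = np.zeros(10)
np.add(arr, arr, out=out_direct)

# Form 2: compute first, then assign. The addition runs in uint16 and
# wraps modulo 2**16 before the values are stored as float64.
out_assigned = np.zeros(10)
out_assigned[...] = np.add(arr, arr)

# Form 3: explicitly request the float64 loop via `dtype=`.
out_forced = np.zeros(10)
np.add(arr, arr, out=out_forced, dtype=out_forced.dtype)

print(out_direct)    # matches either form 2 or form 3
print(out_assigned)  # wrapped values: 0., 2., 4., ...
print(out_forced)    # full precision: 65536., 65538., ...
```

Only forms 2 and 3 are pinned down by the input dtypes and the explicit `dtype=` request; form 1 is the one whose result depends on how `out` takes part in loop selection.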

On Fri, Sep 27, 2019, 15:02 Sebastian Berg <sebastian@sipsolutions.net> wrote:
On Fri, 2019-09-27 at 11:50 -0700, Sebastian Berg wrote:
> Hi all,
>
> Looking at the ufunc dispatching rules with an `out` argument, I was a
> bit surprised to realize this little gem is how things work:
>
> ```
> arr = np.arange(10, dtype=np.uint16) + 2**15
> print(arr)
> # array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18], dtype=uint16)
>

Whoops, copied that print wrong of course.

Just to be clear, I personally will consider this an accuracy/precision
bug and assume that we can just switch the behaviour fairly
unceremoniously at some point (and if someone feels that should be a
major release, I do not mind).
It seems like one of those things that will definitely fix some bugs
but could break the odd system/assumption somewhere. Similar to fixing
the memory overlap issues.

- Sebastian


> out = np.zeros(10)
>
> np.add(arr, arr, out=out)
> print(repr(out))
> # array([ 0.,  2.,  4.,  6.,  8., 10., 12., 14., 16., 18.])
> ```
>
> This is strictly speaking correct/consistent. What the ufunc tries to
> ensure is that whatever the loop produces fits into `out`.
> However, I still find it unexpected that it does not pick the full
> precision loop.
>
> There is currently only one way to achieve that, and that is by using
> `dtype=out.dtype` (or similar incarnations), which specifies the exact
> dtype [0].
>
> Of course this is also because I would like to simplify things for a
> new dispatching system, but I would like to propose to disable the
> above behaviour. This would mean:
>
> ```
> # make the call:
> np.add(arr, arr, out=out)
>
> # Equivalent to the current [1]:
> np.add(arr, arr, out=out, signature=(None, None, out.dtype))
>
> # Getting the old behaviour requires (assuming inputs have the same dtype):
> np.add(arr, arr, out=out, dtype=arr.dtype)
> ```
>
> and thus force the high precision loop. In very rare cases, this could
> lead to no loop being found.
>
> The main incompatibility is if someone actually makes use of the above
> (integer over/underflow) behaviour, but wants to store it in a higher
> precision array.
>
> I personally currently think we should change it, but am curious if we
> think that we may be able to get away with an accelerated process and
> not a year-long FutureWarning.
>
> Cheers,
>
> Sebastian
>
>
> [0] You can also use `casting="no"` but in all relevant cases that
> should find no loop, since we typically only have homogeneous loop
> definitions, and
>
> [1] Which is normally the same as the shorter spelling
> `dtype=out.dtype` of course.
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion