On Fri, 2019-09-27 at 11:50 -0700, Sebastian Berg wrote:
Hi all,
Looking at the ufunc dispatching rules with an `out` argument, I was a bit surprised to realize this little gem is how things work:
``` arr = np.arange(10, dtype=np.uint16) + 2**15 print(arr) # array([ 0, 2, 4, 6, 8, 10, 12, 14, 16, 18], dtype=uint16)
Whoops, copied that print wrong of course. Just to be clear, I personally will consider this an accuracy/precision bug and assume that we can just switch the behaviour failry unceremoniously at some point (and if someone feels that should be a major release, I do not mind). It seems like one of those things that will definitely fix some bugs but could break the odd system/assumption somewhere. Similar to fixing the memory overlap issues. - Sebastian
out = np.zeros(10)
np.add(arr, arr, out=out) print(repr(out)) # array([ 0., 2., 4., 6., 8., 10., 12., 14., 16., 18.]) ```
This is strictly speaking correct/consistent. What the ufunc tries to ensure is that whatever the loop produces fits into `out`. However, I still find it unexpected that it does not pick the full precision loop.
There is currently only one way to achieve that, and this by using `dtype=out.dtype` (or similar incarnations) which specify the exact dtype [0].
Of course this is also because I would like to simplify things for a new dispatching system, but I would like to propose to disable the above behaviour. This would mean:
``` # make the call: np.add(arr, arr, out=out)
# Equivalent to the current [1]: np.add(arr, arr, out=out, dtype=(None, None, out.dtype))
# Getting the old behaviour requires (assuming inputs have same dtype): np.add(arr, arr, out=out, dtypes=arr.dtype) ```
and thus force the high precision loop. In very rare cases, this could lead to no loop being found.
The main incompatibility is if someone actually makes use of the above (integer over/underflow) behaviour, but wants to store it in a higher precision array.
I personally currently think we should change it, but am curious if we think that we may be able to get away with an accelerate process and not a year long FutureWarning.
Cheers,
Sebastian
[0] You can also use `casting="no"` but in all relevant cases that should find no loop, since the we typically only have homogeneous loop definitions, and
[1] Which is normally the same as the shorter spelling `dtype=out.dtype` of course. _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@python.org https://mail.python.org/mailman/listinfo/numpy-discussion