
On 2/8/21 8:49 pm, Ralf Gommers wrote:
On Mon, Aug 2, 2021 at 7:04 PM Sebastian Berg <sebastian@sipsolutions.net> wrote:
Hi all,
In NumPy 1.21, the output of `np.unique` changed in the presence of multiple NaNs. Previously, every NaN was returned, because each NaN was considered unique; now only one is returned:
a = np.array([1, 1, np.nan, np.nan, np.nan])
Before 1.21:
>>> np.unique(a)
array([ 1., nan, nan, nan])
From 1.21 on:
>>> np.unique(a)
array([ 1., nan])
This change was requested in an old issue:
https://github.com/numpy/numpy/issues/2111
And happened here:
https://github.com/numpy/numpy/pull/18070
While it has a release note, I am not sure the change got the attention it deserved. This would be especially worrying if it is a regression for anyone.
I think it's now the expected answer, not a regression. `unique` is not an elementwise function that needs to adhere to IEEE-754 where nan != nan. I can't remember reviewing this change, but it makes perfect sense to me.
Cheers, Ralf
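The distinction between elementwise comparison and a set-like `unique` can be sketched as follows. This is only an illustration, not code from the PR: `unique_collapsing_nan` is a hypothetical helper name, and it assumes a floating-point input. It gives the "one NaN" result regardless of which NumPy version is installed.

```python
import numpy as np

a = np.array([1, 1, np.nan, np.nan, np.nan])

# Elementwise comparison follows IEEE-754: NaN never compares equal to NaN.
elementwise_equal = (a == a)  # True for the 1s, False for the NaNs


def unique_collapsing_nan(arr):
    """Hypothetical helper (not a NumPy function): unique values, at most one NaN."""
    arr = np.asarray(arr, dtype=float)
    nan_mask = np.isnan(arr)
    # Deduplicate the ordinary values first; these compare equal normally.
    values = np.unique(arr[~nan_mask])
    # Append a single NaN if the input contained any.
    if nan_mask.any():
        values = np.append(values, np.nan)
    return values


result = unique_collapsing_nan(a)
```

Code that has to give the same answer on both sides of the 1.21 change could use a helper along these lines instead of relying on the default behaviour of `np.unique`.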
Matthew and I were discussing this today and came up with an edge case: set(a) still returns the old result, with every NaN kept. We should document this as a "feature".
Matti
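The set(a) edge case can be checked directly. A minimal sketch, assuming NumPy 1.21 or later for the `np.unique` part; the variable names are only illustrative:

```python
import numpy as np

a = np.array([1, 1, np.nan, np.nan, np.nan])

# Iterating over the array creates a separate scalar object per element.
# The NaN scalars are neither identical nor equal to each other, so the
# set keeps all three of them alongside the single 1.0.
as_set = set(a)
n_nan_in_set = sum(1 for x in as_set if np.isnan(x))

# np.unique (NumPy >= 1.21) collapses the NaNs into one.
n_nan_in_unique = int(np.isnan(np.unique(a)).sum())

print(len(as_set), n_nan_in_set, n_nan_in_unique)
```

The difference comes from how Python sets decide membership (identity first, then equality), which is independent of the change made in `np.unique`.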