Re: [Numpy-discussion] Revert the return of a single NaN for `np.unique` with floating point numbers?

Aug. 3, 2021


      On 2/8/21 8:49 pm, Ralf Gommers wrote:
...
On Mon, Aug 2, 2021 at 7:04 PM Sebastian Berg <sebastian@sipsolutions.net> wrote:
Hi all,
In NumPy 1.21, the output of `np.unique` changed in the presence of
multiple NaNs.  Previously, every NaN was returned (each NaN was
considered unique), whereas now only one is returned:
    >>> a = np.array([1, 1, np.nan, np.nan, np.nan])
Before 1.21:
    >>> np.unique(a)
    array([ 1., nan, nan, nan])
Since 1.21:
    >>> np.unique(a)
    array([ 1., nan])
This change was requested in an old issue:
    https://github.com/numpy/numpy/issues/2111
And happened here:
    https://github.com/numpy/numpy/pull/18070
While it has a release note, I am not sure the change got the
attention it deserved.  This would be especially worrying if it is a
regression for anyone.
I think it's now the expected answer, not a regression. `unique` is 
not an elementwise function that needs to adhere to IEEE-754 where nan 
!= nan. I can't remember reviewing this change, but it makes perfect 
sense to me.
Cheers,
Ralf
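For readers following along, here is a minimal sketch of the IEEE-754 behaviour mentioned above. It only shows that elementwise comparison treats NaN as unequal to itself. The variable name `elementwise_equal` is just for illustration.

```python
import numpy as np

a = np.array([1, 1, np.nan, np.nan, np.nan])

# Elementwise comparison follows IEEE-754: NaN never compares equal,
# not even to itself, so the NaN positions come out False.
elementwise_equal = a == a
print(elementwise_equal)

# The same rule applies to the scalar constant.
print(np.nan != np.nan)
```

`np.unique` is a set-like reduction rather than an elementwise operation, which is the distinction being drawn here.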
Matthew and I were discussing this today and came up with an edge
case: `set(a)` still returns the old result, with every NaN kept. We
should document this as a "feature".

Matti

Matti Picus