[Numpy-discussion] strange behavior of numpy.unique

Warren Weckesser warren.weckesser at gmail.com
Tue Nov 6 21:52:24 EST 2012

On Tue, Nov 6, 2012 at 8:27 PM, Phillip Feldman <phillip.m.feldman at gmail.com> wrote:

> numpy.unique behaves as I would expect for small inputs like the following:
> In [12]: x= [0, 0, 1, 0, 1, 2, 0, 1, 2, 3]
> In [13]: unique(x, return_index=True)
> Out[13]: (array([0, 1, 2, 3]), array([0, 2, 5, 9], dtype=int64))
> But, when I give it something larger, the return index values do not
> always correspond to the first occurrences in the input. The documentation
> is silent on the question of how the return index values are chosen when a
> given element of x appears more than once. Either the documentation should
> be clarified, or better yet, the behavior should be changed.

In fact, it was changed (in the master branch on github) several months
ago, but there has not yet been a release with the changes.  The sort
method that np.unique passes to np.argsort is now 'mergesort', and the
docstring states that the indices returned are for the first occurrences of
the unique elements.  The new docstring is here:

the commit, and
https://github.com/numpy/numpy/blob/master/numpy/lib/arraysetops.py for the
latest version of the source.
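
To illustrate why the sort kind matters, here is a minimal sketch of the idea, not the actual code in arraysetops.py. The helper name first_occurrence_unique is made up for this example. It relies on 'mergesort' being a stable sort: equal elements keep their original relative order, so the first entry in each run of equal values in the sorted array carries the smallest original index. The default 'quicksort' kind makes no such guarantee.

```python
import numpy as np


def first_occurrence_unique(values):
    """Return (unique values, index of the first occurrence of each).

    Hypothetical helper for illustration only; it is not numpy's
    implementation of np.unique.
    """
    a = np.asarray(values).ravel()
    # A stable sort keeps equal elements in their original order, so
    # within each run of equal values the smallest index comes first.
    order = np.argsort(a, kind="mergesort")
    sorted_a = a[order]
    # Mark the first element of every run of equal values.
    is_first = np.ones(sorted_a.shape, dtype=bool)
    is_first[1:] = sorted_a[1:] != sorted_a[:-1]
    return sorted_a[is_first], order[is_first]


x = [0, 0, 1, 0, 1, 2, 0, 1, 2, 3]
uniq, idx = first_occurrence_unique(x)
print(uniq)  # [0 1 2 3]
print(idx)   # [0 2 5 9]
```

For the small input from the original message this gives the same indices as Out[13] above, and because the sort is stable the same holds for larger inputs.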

