[Numpy-discussion] unique() should return a sorted array

Norbert Nemec Norbert.Nemec.list at gmx.de
Tue Jul 11 11:00:50 EDT 2006


unique1d is based on ediff1d, so it actually computes the difference
between each pair of neighbouring elements and compares each one to 0.0.

This is inefficient, even though the cost is largely hidden by the
general inefficiency of Python. (It might be the reason for the two
milliseconds, though.)

What is more, subtraction works only for numbers, while the various
proposed versions use only comparison, which works for any data type (as
long as it can be sorted).
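
The contrast between the two approaches can be sketched as follows. This
is only an illustration, not the actual unique1d source: the helper names
unique_by_difference and unique_by_comparison are hypothetical, and the
difference is written out as an explicit subtraction of neighbouring
elements, which is what a difference-based implementation amounts to.

```python
import numpy as np

def unique_by_difference(values):
    """Sort, subtract neighbours, keep elements whose step is nonzero."""
    tmp = np.sort(np.asarray(values).ravel())
    if tmp.size == 0:
        return tmp
    steps = tmp[1:] - tmp[:-1]          # requires a numeric dtype
    keep = np.concatenate(([True], steps != 0))
    return tmp[keep]

def unique_by_comparison(values):
    """Sort, then keep elements that differ from their predecessor."""
    tmp = np.sort(np.asarray(values).ravel())
    if tmp.size == 0:
        return tmp
    keep = np.concatenate(([True], tmp[1:] != tmp[:-1]))
    return tmp[keep]

numbers = [3, 1, 2, 3, 1]
words = ["pear", "apple", "pear", "fig"]

print(unique_by_difference(numbers))
print(unique_by_comparison(numbers))
print(unique_by_comparison(words))

try:
    unique_by_difference(words)
except TypeError as exc:
    # NumPy has no subtraction loop for string arrays
    print("subtraction failed:", exc)
```

Both helpers agree on numeric input; only the comparison-based one also
accepts the string array.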

My own version tries to handle every case that the current unique
handles.

Sasha's version only works for numpy arrays and has a problem for arrays
with all identical entries.

David's version only works for numpy arrays of types that can be
converted to float.

Once more, I propose using my own version, as given before:

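from numpy import asarray, concatenate

def unique(arr, sort=True):
    if hasattr(arr, 'flatten'):
        # numpy arrays: sort a flattened copy, then keep each element
        # that differs from its predecessor (result is always sorted)
        tmp = arr.flatten()
        tmp.sort()
        if tmp.size == 0:
            return tmp
        idx = concatenate(([True], tmp[1:] != tmp[:-1]))
        return tmp[idx]
    else: # for compatibility:
        # other sequences: collect the hashable items as dict keys
        seen = {}
        for item in arr:
            seen[item] = None
        if sort:
            return asarray(sorted(seen.keys()))
        else:
            return asarray(list(seen.keys()))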
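
A quick way to check the sort-and-compare idea on the awkward inputs
(all-identical entries, a single element, an empty array) is to compare
it against np.unique. This is a minimal sketch under stated assumptions:
sorted_unique is a hypothetical stand-alone restatement of the array
branch only, and the test cases are made up for illustration.

```python
import numpy as np

def sorted_unique(arr):
    """Sort-and-compare sketch for array input (hypothetical helper)."""
    tmp = np.asarray(arr).flatten()     # flatten() returns a copy
    tmp.sort()
    if tmp.size == 0:                   # nothing to compare
        return tmp
    # The first element is always kept; later ones only if they differ
    # from the element before them.
    keep = np.concatenate(([True], tmp[1:] != tmp[:-1]))
    return tmp[keep]

cases = {
    "all identical": np.array([7, 7, 7, 7]),
    "single element": np.array([42]),
    "empty": np.array([], dtype=float),
    "two-dimensional": np.array([[2, 1], [1, 3]]),
    "strings": np.array(["b", "a", "b"]),
}

for name, data in cases.items():
    result = sorted_unique(data)
    # np.unique also flattens its input and returns sorted unique values
    assert np.array_equal(result, np.unique(data)), name
    print(name, "->", result)
```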


Greetings,
Norbert




