[Numpy-discussion] unique() should return a sorted array
Norbert Nemec
Norbert.Nemec.list at gmx.de
Tue Jul 11 11:00:50 EDT 2006
unique1d is based on ediff1d, so it actually computes all the differences
between neighbouring elements and compares them to 0.0.
This is inefficient, even though the cost is hidden by the general
inefficiency of Python (it might be the reason for the two milliseconds,
though).
What is more, subtraction works only for numbers, while the various
proposed versions use only comparison, which works for any data type (as
long as it can be sorted).
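
To make the point concrete, here is a minimal plain-Python sketch (not the numpy code under discussion; both function names are made up for illustration). It contrasts finding unique elements by comparing sorted neighbours with finding them by subtracting sorted neighbours:

```python
def unique_by_comparison(items):
    """Sort, then keep each element that differs from its predecessor."""
    ordered = sorted(items)
    # The first element is always kept; later ones only if they differ
    # from the element just before them.
    return [x for i, x in enumerate(ordered)
            if i == 0 or x != ordered[i - 1]]


def unique_by_difference(items):
    """Sort, then keep each element whose difference to its predecessor is nonzero."""
    ordered = sorted(items)
    # Same idea, but it relies on subtraction being defined for the items.
    return [x for i, x in enumerate(ordered)
            if i == 0 or (x - ordered[i - 1]) != 0]


numbers = [3, 1, 2, 3, 1]
words = ["b", "a", "b", "c"]

print(unique_by_comparison(numbers))
print(unique_by_difference(numbers))
print(unique_by_comparison(words))
try:
    unique_by_difference(words)
except TypeError as exc:
    # Strings can be sorted and compared, but not subtracted.
    print("difference-based version fails for strings:", exc)
```

Both variants agree on numbers, but only the comparison-based one handles strings, which can be sorted and compared but not subtracted.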
My own version tried to capture all possible cases that the current
unique captures.
Sasha's version only works for numpy arrays and has a problem for arrays
with all identical entries.
David's version only works for numpy arrays of types that can be
converted to float.
I would once more propose to use my own version as given before:
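    from numpy import asarray, concatenate

    def unique(arr, sort=True):
        if hasattr(arr, 'flatten'):
            tmp = arr.flatten()
            tmp.sort()
            # keep the first element and every element that differs
            # from its predecessor in the sorted array
            idx = concatenate(([True], tmp[1:] != tmp[:-1]))
            return tmp[idx]
        else:  # for compatibility:
            seen = {}
            for item in arr:
                seen[item] = None
            if sort:
                return asarray(sorted(seen.keys()))
            else:
                return asarray(list(seen.keys()))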
Greetings,
Norbert