[Numpy-discussion] unique() should return a sorted array
Robert Cimrman
cimrman3 at ntc.zcu.cz
Tue Jul 11 12:49:41 EDT 2006
Norbert Nemec wrote:
> unique1d is based on ediff1d, so it really calculates many differences
> and compares those to 0.0
>
> This is inefficient, even though this is hidden by the general
> inefficiency of Python (It might be the reason for the two milliseconds,
> though)
>
> What is more: subtraction works only for numbers, while the various
> proposed versions use only comparison which works for any data type (as
> long as it can be sorted)
I agree that unique1d works only for numbers, but that is what it was
meant for... well, for integers only, in fact - everyone here surely
knows that comparing floats with != does not work well.
Note also that it was written before logical indexing and other neat
stuff were possible in numpy - every improvement is welcome!
(btw. I cannot recall why I used subtraction and testing for zero
instead of just comparisons - maybe remnants from my old matlab days and
its logical arrays. ediff1d should disappear from the other functions
in arraysetops as well.)
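For illustration, a minimal sketch of the comparison-only approach, assuming the input can be converted to a sortable 1-D NumPy array. The name unique_by_comparison is hypothetical and is not the arraysetops function; the empty-input guard is my own addition.

```python
import numpy as np

def unique_by_comparison(arr):
    """Sorted unique values of arr, using only sorting and != (no subtraction)."""
    tmp = np.asarray(arr).flatten()  # flatten() returns a copy, so arr is untouched
    if tmp.size == 0:
        return tmp  # nothing to compare
    tmp.sort()
    # Keep the first element and every element that differs from its predecessor.
    keep = np.concatenate(([True], tmp[1:] != tmp[:-1]))
    return tmp[keep]

ints = unique_by_comparison([3, 1, 2, 3, 1])
strs = unique_by_comparison(np.array(["b", "a", "b"]))
same = unique_by_comparison([7, 7, 7])  # all entries identical
```

Because only sorting and != are involved, the same code handles strings as well as integers, and an array with all identical entries reduces to a single element.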
> My own version tried to capture all possible cases that the current
> unique captures.
>
> Sasha's version only works for numpy arrays and has a problem for arrays
> with all identical entries.
>
> David's version only works for numpy arrays of types that can be
> converted to float.
comparing floats...
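A small sketch of why this matters for uniqueness, using values chosen only for illustration: two floats that are equal on paper can differ in the last bit, so exact comparison treats them as distinct.

```python
import numpy as np

# 0.1 + 0.2 and 0.3 are not the same double-precision value.
vals = np.array([0.1 + 0.2, 0.3])

exact_distinct = bool(vals[0] != vals[1])   # exact comparison sees two values
close = bool(np.isclose(vals[0], vals[1]))  # tolerance-based comparison does not
n_unique = len(np.unique(vals))             # exact uniqueness keeps both
```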
> I would once more propose to use my own version as given before:
>
> def unique(arr,sort=True):
>     if hasattr(arr,'flatten'):
>         tmp = arr.flatten()
>         tmp.sort()
>         idx = concatenate(([True],tmp[1:]!=tmp[:-1]))
>         return tmp[idx]
>     else: # for compatibility:
>         set = {}
>         for item in arr:
>             set[item] = None
>         if sort:
>             return asarray(sorted(set.keys()))
>         else:
>             return asarray(set.keys())
Have you considered using set instead of dict? Just curious :-)
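Something along these lines, as a rough sketch only: it assumes the items are hashable, and sortable when sort is true, and the name unique_fallback is made up for the example.

```python
import numpy as np

def unique_fallback(seq, sort=True):
    """Unique items of a generic iterable of hashable items, via the built-in set."""
    items = set(seq)  # duplicates collapse here; no dummy dict values needed
    if sort:
        return np.asarray(sorted(items))
    # Without sorting, the order of the result is arbitrary.
    return np.asarray(list(items))

result = unique_fallback([3, 1, 2, 3, 1])
```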
r.