Norbert Nemec wrote:
unique1d is based on ediff1d, so it really calculates many differences and compares those to 0.0.
This is inefficient, even though the cost is hidden by the general inefficiency of Python (it might be the reason for the two milliseconds, though).
What is more: subtraction works only for numbers, while the various proposed versions use only comparison, which works for any data type (as long as it can be sorted).
I agree that unique1d works only for numbers, but that is what it was meant for... well, for integers only, in fact - everyone here surely knows that comparing floats with != does not work well. Note also that it was written before logical indexing and other neat stuff were possible in numpy - every improvement is welcome! (Btw. I cannot recall why I used subtraction and testing for zero instead of just comparisons - maybe remnants from my old matlab days and its logical arrays. ediff1d should disappear from the other functions in arraysetops.)
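To make the comparison-versus-subtraction point concrete, here is a minimal sketch, not the arraysetops code: unique_sorted is a hypothetical helper name, and it assumes a current numpy with boolean indexing. It uses only != on neighbouring sorted entries, so it also accepts strings, and the last call shows why floats remain tricky.

```python
import numpy as np

def unique_sorted(values):
    """Hypothetical helper: sorted unique entries using comparisons only."""
    tmp = np.sort(np.asarray(values).ravel())
    if tmp.size == 0:
        return tmp
    # An entry is kept when it differs from its left neighbour;
    # the first entry is always kept.
    keep = np.concatenate(([True], tmp[1:] != tmp[:-1]))
    return tmp[keep]

ints = unique_sorted([3, 1, 3, 2, 1])
# Strings can be sorted and compared, but not subtracted.
words = unique_sorted(["b", "a", "b", "c", "a"])
# 0.1 + 0.2 and 0.3 differ in the last bit, so both values survive.
floats = unique_sorted([0.1 + 0.2, 0.3])
```

A difference-based approach would have no meaningful answer for the string case, since there is no subtraction for strings.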
My own version tried to capture all possible cases that the current unique captures.
Sasha's version only works for numpy arrays and has a problem for arrays with all identical entries.
David's version only works for numpy arrays of types that can be converted to float.
comparing floats...
I would once more propose to use my own version as given before:
    from numpy import asarray, concatenate

    def unique(arr, sort=True):
        if hasattr(arr, 'flatten'):
            tmp = arr.flatten()
            tmp.sort()
            # keep the first entry of every run of equal values
            idx = concatenate(([True], tmp[1:] != tmp[:-1]))
            return tmp[idx]
        else:  # for compatibility:
            seen = {}
            for item in arr:
                seen[item] = None
            if sort:
                return asarray(sorted(seen.keys()))
            else:
                return asarray(list(seen.keys()))
Have you considered using set instead of dict? Just curious :-) r.
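What I have in mind for the compatibility branch is roughly the following sketch. unique_fallback is a made-up name, and it assumes the items are hashable and that the built-in set type is available (Python 2.4 or later). It returns a plain list; wrapping the result in asarray would work as in the version above.

```python
def unique_fallback(items, sort=True):
    """Hypothetical set-based replacement for the dict-based branch."""
    seen = set(items)  # duplicates collapse here; items must be hashable
    if sort:
        return sorted(seen)
    return list(seen)  # order is arbitrary when sort is False

result = unique_fallback([3, 1, 3, 2, 1])
```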