On Sun, Aug 18, 2013 at 7:14 PM, Joe Kington <joferkington@gmail.com> wrote:
Hi everyone,
I've recently put together a pull request that adds an `axis` kwarg to `numpy.unique` so that `unique` can easily be used to find unique rows/columns/sub-arrays/etc. of a larger array.
https://github.com/numpy/numpy/pull/3584
Currently, this works as a wrapper around `unique`. If `axis` is specified, it reshapes the input to a 2D contiguous array, views each row as a single item, then passes it on to `unique`. For int and string dtypes, each row is viewed as a void dtype, so bitwise equality is used for comparisons. For all other dtypes, each row is viewed as a structured array.
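A minimal sketch of the "view each row as a single item" idea described above, not the code from the pull request. The helper name `unique_rows` is hypothetical, and the sketch uses only the structured-dtype view (the path the post describes for dtypes other than ints and strings), since that variant returns rows in plain lexicographic order and is easy to check by eye.

```python
import numpy as np


def unique_rows(a):
    """Return the unique rows of a 2D array (hypothetical helper, for illustration)."""
    # The row view below requires a C-contiguous buffer.
    a = np.ascontiguousarray(a)
    if a.ndim != 2:
        raise ValueError("expected a 2D array")
    n_cols = a.shape[1]

    # One field per column, so each row becomes a single structured item.
    row_dtype = [(f"f{i}", a.dtype) for i in range(n_cols)]
    rows = a.view(row_dtype).ravel()  # shape: (n_rows,)

    # np.unique now compares whole rows, field by field.
    uniq = np.unique(rows)

    # View the result back as the original dtype and restore the 2D shape.
    return uniq.view(a.dtype).reshape(-1, n_cols)


data = np.array([[1, 2], [3, 4], [1, 2], [0, 5]])
result = unique_rows(data)
print(result)
```

The void-dtype variant mentioned for ints and strings would follow the same shape, but compares raw bytes, so the order of the returned rows need not match numeric order.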
The current implementation has two main drawbacks:
- For anything other than ints and strings, it's relatively slow.
- It doesn't work with object arrays of any sort.
I'd appreciate any thoughts/feedback folks might have on both the general idea and this specific implementation. I think it's a worthwhile addition, but I'm biased.
Just a general comment: I have been missing a `unique_rows` or something like that, which seems to be the target of this change.
However, my first interpretation of an `axis` argument in `unique` would be that it treats each column (or whatever is along `axis`) separately, analogously to `max`, `argmax`, and similar.
On second thought: `unique` with `axis` working on each column separately wouldn't create a nice return array, because it won't be rectangular (in general).
Josef
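To illustrate the non-rectangular point, here is a small sketch (the array `a` is made-up example data) that applies the existing `numpy.unique` to each column separately:

```python
import numpy as np

# Made-up data: the first column has one distinct value, the second has three.
a = np.array([[1, 1],
              [1, 2],
              [1, 3]])

# Apply unique to each column on its own, the way max/argmax treat an axis.
per_column = [np.unique(a[:, j]) for j in range(a.shape[1])]
lengths = [len(u) for u in per_column]

print(per_column)
print(lengths)
```

The per-column results have different lengths here, so they cannot be stacked into a single regular array.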
Thanks! -Joe
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion