[Numpy-discussion] Finding unique rows in an array [Was: Finding a row match within a numpy array]

Francesc Altet faltet at carabos.com
Wed Aug 22 05:11:16 EDT 2007


On Tuesday 21 August 2007, Mark.Miller wrote:
> A slightly related question on this topic...
>
> Is there a good loopless way to identify all of the unique rows in an
> array?  Something like numpy.unique() is ideal, but capable of
> extracting unique subarrays along an axis.

You can always view each row as a single fixed-length string and then
call unique() on that view. Here is an example:

In [1]: import numpy
In [2]: a=numpy.arange(12).reshape(4,3)
In [3]: a[2]=(3,4,5)
In [4]: a
Out[4]: 
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 3,  4,  5],
       [ 9, 10, 11]])

Now create the view, with one string item per row (itemsize times the
number of columns, in bytes), and select the unique rows:

In [5]: b=numpy.unique(a.view('S%d'%(a.itemsize*a.shape[1]))).view(a.dtype)

And finally restore the shape:

In [6]: b.reshape((len(b)//a.shape[1], a.shape[1]))
Out[6]: 
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 9, 10, 11]])

If you want to find unique columns instead of rows, transpose the
initial array first.
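
For reference, the same trick can be wrapped in a small function. This is
only a sketch: the helper name unique_rows_bytes is made up for
illustration, and the comparison against numpy.unique(a, axis=0) assumes
NumPy 1.13 or later, where unique() accepts an axis argument.

```python
import numpy as np

# Same example array as in the session above.
a = np.arange(12).reshape(4, 3)
a[2] = (3, 4, 5)


def unique_rows_bytes(arr):
    """Hypothetical helper: unique rows via a fixed-length string view."""
    arr = np.ascontiguousarray(arr)              # the view needs contiguous rows
    nbytes = arr.dtype.itemsize * arr.shape[1]   # number of bytes in one row
    rows = arr.view('S%d' % nbytes)              # shape (nrows, 1): one item per row
    uniq = np.unique(rows)                       # flattened, sorted by raw bytes
    # Reinterpret the bytes as the original dtype and restore the columns.
    return uniq.view(arr.dtype).reshape(-1, arr.shape[1])


by_bytes = unique_rows_bytes(a)

# Assumption: NumPy >= 1.13, where unique() takes an axis argument.
by_axis = np.unique(a, axis=0)

print(by_bytes)
print(by_axis)
```

Note that the string view compares raw bytes, so the rows come back in
byte order rather than numeric order, and floating-point values that
compare equal but have different bit patterns (0.0 and -0.0, for
instance) would be treated as distinct. For this small example both
approaches return the rows in the same order.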

Cheers,

-- 
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"


