[Numpy-discussion] Remove duplicate columns

Charles R Harris charlesr.harris at gmail.com
Thu May 6 23:45:25 EDT 2010


On Thu, May 6, 2010 at 11:25 AM, T J <tjhnson at gmail.com> wrote:

> Hi,
>
> Is there a way to sort the columns in an array?  I need to sort it so
> that I can easily go through and keep only the unique columns.
> ndarray.sort(axis=1) doesn't do what I want as it destroys the
> relative ordering between the various columns. For example, I would
> like:
>
> [[2,1,3],
>  [3,5,1],
>  [0,3,1]]
>
> to go to:
>
> [[1,2,3],
>  [5,3,1],
>  [3,0,1]]
>
> (swap the first and second columns).  So I want to treat the columns
> as objects and sort them.  I can do this if I convert to a python
> list, but I was hoping to avoid doing that because I ultimately need
> to do element-wise bitwise operations.
>
>
To get the order illustrated (with array and lexsort imported from numpy):

In [9]: a = array([[2,1,3],[3,5,1],[0,3,1]])

In [10]: i = lexsort([a[::-1][i] for i in range(3)])

In [11]: a[:,i]
Out[11]:
array([[1, 2, 3],
       [5, 3, 1],
       [3, 0, 1]])
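
The row reversal above matters because lexsort treats the last key it is given as the primary sort key. A minimal standalone sketch of the same step, passing the reversed array directly instead of building a list (the names order and sorted_cols are only for illustration):

```python
import numpy as np

a = np.array([[2, 1, 3],
              [3, 5, 1],
              [0, 3, 1]])

# lexsort uses the LAST key as the primary key, so reversing the rows
# makes row 0 the primary key, row 1 the secondary key, and so on.
order = np.lexsort(a[::-1])

# Reorder whole columns; each column's entries stay together.
sorted_cols = a[:, order]
print(order)
print(sorted_cols)
```

Here order comes out as [1, 0, 2], which swaps the first and second columns.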


But if you just want the columns sorted (lexsort uses the last row as the
primary key), it is easier to do

In [12]: i = lexsort([a[i] for i in range(3)])

In [13]: a[:,i]
Out[13]:
array([[2, 3, 1],
       [3, 1, 5],
       [0, 1, 3]])

or just

In [18]: a[:,lexsort(a)]
Out[18]:
array([[2, 3, 1],
       [3, 1, 5],
       [0, 1, 3]])

For the bigger array, where identical columns end up next to each other

In [21]: a
Out[21]:
array([[3, 2, 2, 2, 2],
       [2, 2, 0, 2, 2],
       [0, 1, 1, 0, 1],
       [5, 5, 3, 0, 5]])

In [22]: a[:, lexsort(a)]
Out[22]:
array([[2, 2, 3, 2, 2],
       [2, 0, 2, 2, 2],
       [0, 1, 0, 1, 1],
       [0, 3, 5, 5, 5]])
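
Once identical columns sit next to each other, the duplicates can be dropped by comparing each column with its left neighbour. This is only a sketch of one way to finish the job, not something from the session above; the names s, keep and unique_cols are made up for illustration:

```python
import numpy as np

a = np.array([[3, 2, 2, 2, 2],
              [2, 2, 0, 2, 2],
              [0, 1, 1, 0, 1],
              [5, 5, 3, 0, 5]])

# Sort the columns so that identical columns become adjacent.
s = a[:, np.lexsort(a)]

# keep[j] is True when column j differs from the column to its left;
# the first column is always kept.
keep = np.ones(s.shape[1], dtype=bool)
keep[1:] = (s[:, 1:] != s[:, :-1]).any(axis=0)

unique_cols = s[:, keep]
print(unique_cols)
```

The result stays an ndarray, so element-wise bitwise operations still work on it. Newer NumPy releases also accept an axis argument to np.unique, which may be simpler if it is available in your version.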

Chuck