[Numpy-discussion] lexsort

Travis Oliphant oliphant.travis at ieee.org
Thu Jun 1 19:32:03 EDT 2006


Tom Denniston wrote:
> This function is really useful but it seems to only take tuples not
> ndarrays.   This seems kinda strange.  Does one have to convert the
> ndarray into a tuple to use it?  This seems extremely inefficient.  Is
> there an efficient way to argsort a 2d array based upon multiple
> columns if lexsort is not the correct way to do this?  The only way I
> have found to do this is to construct a list of tuples and sort them
> using python's list sort.  This is inefficient and convoluted so I was
> hoping lexsort would provide a simple solution.
>   

I've just changed lexsort to accept any sequence object as keys.   This
means that it can now be used to sort a 2d array (of the same data-type)
based on multiple rows.  The last row is the primary sort key: ties in
the last row are broken by the second-to-last row, any remaining ties by
the third-to-last row, and so forth.
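
As a rough illustration of that key ordering, here is a minimal sketch that passes two separate 1-d keys as a tuple. It assumes a current NumPy imported as np; the names primary and secondary are made up for this example and are not part of lexsort's interface.

```python
import numpy as np

# Hypothetical keys: `primary` decides the order, `secondary` breaks ties.
secondary = np.array([1, 3, 2, 2, 3, 3])
primary = np.array([4, 5, 4, 6, 4, 3])

# lexsort treats the LAST key in the sequence as the primary sort key.
ind = np.lexsort((secondary, primary))

# Pair up the keys in sorted order so the tie-breaking is easy to inspect.
pairs = list(zip(primary[ind].tolist(), secondary[ind].tolist()))
print(ind.tolist())
print(pairs)
```

Reading the printed pairs left to right, the first element of each pair never decreases, and the second element only matters where the first one repeats.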

The return value is an array of indices.   For the 2d example you can use

ind = lexsort(a)
sorted = a[:,ind]   # or a.take(ind,axis=-1)


Example:

 >>> a = array([[1,3,2,2,3,3],[4,5,4,6,4,3]])
 >>> ind = lexsort(a)
 >>> sorted = a.take(ind,axis=-1)
 >>> sorted
array([[3, 1, 2, 3, 3, 2],
       [3, 4, 4, 4, 5, 6]])
 >>> a
array([[1, 3, 2, 2, 3, 3],
       [4, 5, 4, 6, 4, 3]])
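
The original question asked about sorting on multiple columns rather than rows. One way to get there, sketched under the assumption of a current NumPy imported as np, is to transpose the array so that columns become keys and reverse them so the first column ends up as the primary key. The array records and its contents are invented for illustration.

```python
import numpy as np

# Hypothetical data: each row is a record, each column a field.
records = np.array([[3, 1],
                    [1, 2],
                    [3, 0],
                    [1, 1]])

# Goal: order the rows by column 0 first, then by column 1.
# lexsort expects keys as rows with the primary key LAST, so transpose
# (columns become rows) and reverse the key order.
ind = np.lexsort(records.T[::-1])

# Index with `ind` along the first axis to reorder whole rows.
by_columns = records[ind]
print(by_columns)
```

This avoids building a Python list of tuples; the transpose and the reversal are views of the original data, and only the final indexing step copies it.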



-Travis





