This function is really useful but it seems to only take tuples not ndarrays. This seems kinda strange. Does one have to convert the ndarray into a tuple to use it? This seems extremely inefficient. Is there an efficient way to argsort a 2d array based upon multiple columns if lexsort is not the correct way to do this? The only way I have found to do this is to construct a list of tuples and sort them using python's list sort. This is inefficient and convoluted so I was hoping lexsort would provide a simple solution.

--Tom
I've just changed lexsort to accept any sequence object as keys. This means that it can now be used to sort a 2d array (of the same data-type) based on multiple rows. The last row is the primary sort key; any repeats are ordered by the second-to-last row, remaining repeats by the third-to-last row, and so forth...

The return value is an array of indices. For the 2d example you can use

    ind = lexsort(a)
    sorted = a[:,ind]  # or a.take(ind,axis=-1)

Example:

    >>> a = array([[1,3,2,2,3,3],[4,5,4,6,4,3]])
    >>> ind = lexsort(a)
    >>> sorted = a.take(ind,axis=-1)
    >>> sorted
    array([[3, 1, 2, 3, 3, 2],
           [3, 4, 4, 4, 5, 6]])
    >>> a
    array([[1, 3, 2, 2, 3, 3],
           [4, 5, 4, 6, 4, 3]])
-Travis
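For readers on a current NumPy, the behaviour Travis describes can be checked with a minimal sketch like the one below. It assumes the modern `import numpy as np` spelling rather than the bare `array` and `lexsort` names used in the thread, and the names `a_sorted` and `ind_tuple` are illustrative only (chosen to avoid shadowing Python's built-in `sorted`).

```python
import numpy as np

# Same data as in the example above: two rows, six columns.
a = np.array([[1, 3, 2, 2, 3, 3],
              [4, 5, 4, 6, 4, 3]])

# Each row of `a` acts as one key; the LAST row is the primary key,
# and ties are broken by the row above it.
ind = np.lexsort(a)

# Reorder the columns of `a` with the index array.
a_sorted = a[:, ind]  # equivalent to a.take(ind, axis=-1)

# Passing the rows as an explicit tuple of keys gives the same indices.
ind_tuple = np.lexsort((a[0], a[1]))

print(ind)
print(a_sorted)
```

The original array is left untouched; only the index array is used to build the reordered copy.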
This is great! Many thanks Travis. I can't wait for the next release!

--Tom
Tom,

The list -- nee tuple, thanks Travis -- is the list of key sequences and each key sequence can be a column in a matrix. So for instance if you wanted to sort on a few columns of a matrix, say columns 2, 1, and 0, in that order, and then rearrange the rows so the columns were ordered, you would do something like:

    >>> a = randint(0,2,(7,4))
    >>> a
    array([[0, 0, 0, 1],
           [0, 0, 1, 0],
           [1, 0, 0, 1],
           [0, 1, 0, 1],
           [1, 1, 1, 0],
           [0, 1, 1, 1],
           [0, 1, 0, 1]])
    >>> ind = lexsort((a[:,2],a[:,1],a[:,0]))
    >>> sorted = a[ind]
    >>> sorted
    array([[0, 0, 0, 1],
           [0, 0, 1, 0],
           [0, 1, 0, 1],
           [0, 1, 0, 1],
           [0, 1, 1, 1],
           [1, 0, 0, 1],
           [1, 1, 1, 0]])
Note that the last key defines the major order.

Chuck

On 6/1/06, Tom Denniston <tom.denniston@alum.dartmouth.org> wrote:
> This function is really useful but it seems to only take tuples not ndarrays. This seems kinda strange. Does one have to convert the ndarray into a tuple to use it? This seems extremely inefficient. Is there an efficient way to argsort a 2d array based upon multiple columns if lexsort is not the correct way to do this? The only way I have found to do this is to construct a list of tuples and sort them using python's list sort. This is inefficient and convoluted so I was hoping lexsort would provide a simple solution.
>
> --Tom
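Chuck's column-key recipe can be wrapped in a small helper and cross-checked against the list-of-tuples approach from the original question. This is only a sketch: `sort_rows_by_columns` is a hypothetical name, not part of NumPy, and the code assumes the modern `import numpy as np` spelling. The fixed array is the random draw shown in Chuck's message.

```python
import numpy as np

def sort_rows_by_columns(a, cols):
    """Hypothetical helper: sort the rows of `a`, with cols[0] as the major key."""
    # lexsort treats its LAST key as the major one, so pass the keys reversed.
    keys = tuple(a[:, c] for c in reversed(cols))
    ind = np.lexsort(keys)
    return a[ind]

# The array printed in Chuck's message (originally drawn with randint).
a = np.array([[0, 0, 0, 1],
              [0, 0, 1, 0],
              [1, 0, 0, 1],
              [0, 1, 0, 1],
              [1, 1, 1, 0],
              [0, 1, 1, 1],
              [0, 1, 0, 1]])

# Column 0 is the major key, then column 1, then column 2.
by_numpy = sort_rows_by_columns(a, [0, 1, 2])

# Cross-check: build a list of tuples and use Python's own sort,
# comparing only the first three entries of each row.
by_python = sorted(map(tuple, a.tolist()), key=lambda row: row[:3])

print(by_numpy)
print(by_numpy.tolist() == [list(row) for row in by_python])
```

Writing the column list in major-to-minor order and reversing it inside the helper keeps the "last key is major" convention out of the calling code.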
participants (3): Charles R Harris, Tom Denniston, Travis Oliphant