Great! It works much more efficiently. Thank you so much.
Best,
Chunlei
Tim Hochberg wrote:
CL WU wrote:
Thank you, Tim. argsort() and take() do provide an easy way to sort an array based on any column or row. But for the second question, argsort() doesn't return the result I want. As shown below, sortrank and sortrank2 are the functions I am currently using to get the rank of a vector (the first is more efficient). Each returns, for every value in the original array/list, its index in the sorted array/list.
Hmmm. It seems that argsort and sortrank are inverses of a sort, so it should be possible to do what you want efficiently, but I'm not sure how.
<think>
Ah, it appears to be quite simple. I believe:
argsort(argsort(a))
is equivalent to your sortrank and should be much faster.
regards,
tim
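For anyone reading this thread later, here is a minimal sketch of why the double argsort gives the rank. It is written in plain Python so it runs without Numeric; the helper names argsort and rank below are illustrative stand-ins, not the Numeric functions themselves. The first argsort yields the permutation that sorts the data, and applying argsort to that permutation inverts it, which is exactly "where does each original element land after sorting".

```python
def argsort(values):
    """Indices that would sort `values` (illustrative stand-in for an array argsort)."""
    return sorted(range(len(values)), key=values.__getitem__)


def rank(values):
    """Position of each element in the sorted order: argsort applied twice."""
    return argsort(argsort(values))


a = [5, 2, 3]
order = argsort(a)   # permutation that sorts a
ranks = rank(a)      # inverse of that permutation
print(order, ranks)
```

With an array library that provides argsort, the same idea should be the one-liner argsort(argsort(a)), as suggested above.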
I hope there is an efficient function in array level to do the same work.
>>> from Numeric import *
>>> a=array([5,2,3])
>>> argsort(a)
array([1, 2, 0])
>>> def sortrank(list):
...     n=len(list)
...     li_a=[(i,list[i]) for i in range(n)]
...     li_a.sort(lambda a,b:cmp(a[1],b[1]))
...     li_b=[(i,li_a[i]) for i in range(n)]
...     li_b.sort(lambda a,b:cmp(a[1][0],b[1][0]))
...     return [x[0] for x in li_b]
...
>>> sortrank(a)
[2, 0, 1]
>>> def sortrank2(li):
...     li_sorted=li[:]
...     li_sorted.sort()
...     return [li_sorted.index(x) for x in li]
...
>>> sortrank2(list(a))
[2, 0, 1]
Thanks again.
Chunlei
Tim Hochberg wrote:
CL WU wrote:
Hi, group, I am new to numpy. I have 2 questions about sorting arrays.
1. How to sort an array by one of its columns or rows? I know Python's built-in sort() can do this for a list by passing your own cmp function, but the array function sort() will sort each column or row separately, as far as I know. I don't want to convert the array to a list, sort it, and then convert it back to an array.
I think you want argsort plus take. For example, the following sorts on the second column of a:
    a = array([[4,5,6], [1,2,3], [7,8,9]])
    arg = argsort(a[:,1])
    take(a, arg, 0)
2. How to get the rank of a rank-0 array? The first "rank" means the order of each element after sorting, rather than the "dimension" meaning in numpy, just like the "rank()" function in splus.
If I understand you correctly, you want argsort as mentioned above.
Regards,
tim
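A sketch of the argsort-plus-take recipe, assuming the modern NumPy package (imported as np) rather than the Numeric module used in this thread. np.argsort and np.take are real NumPy functions, but their use here as drop-in equivalents of the Numeric calls is an assumption.

```python
import numpy as np

a = np.array([[4, 5, 6],
              [1, 2, 3],
              [7, 8, 9]])

# Row order that sorts the second column (index 1).
arg = np.argsort(a[:, 1])

# Reorder whole rows using that order; axis=0 selects rows.
sorted_rows = np.take(a, arg, axis=0)
print(sorted_rows)
```

The rows stay intact and only their order changes, which is the difference from calling sort() on the array directly.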
Thank you
Chunlei
This sf.net email is sponsored by:ThinkGeek Welcome to geek heaven. http://thinkgeek.com/sf _______________________________________________ Numpydiscussion mailing list Numpydiscussion@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/numpydiscussion
I'm just starting to move some of my code over to numarray and I was dismayed to find that basic operations between Numeric and numarray arrays fail.
>>> import Numeric as np
>>> import numarray as na
>>> a = na.arange(5)
>>> p = np.arange(5)
>>> a + p
['vector', 'vector']
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "C:\Python23\Lib\site-packages\numarray\numarraycore.py", line 648, in __add__
    def __add__(self, operand): return ufunc.add(self, operand)
  File "C:\Python23\lib\site-packages\numarray\ufunc.py", line 818, in _cache_miss2
    key = (_digest(n1), _digest(n2), _digest(out), safethread.get_ident())
KeyError: '_digest force cache miss'
I suspect (hope!) that this is just a bug and not something inherent in numarray. I dug around in ufunc.py a bit and it appears that the bug is shallow and can be fixed simply by replacing::

    if not (_sequence(n1) or _sequence(n2)):
        key = (_digest(n1), _digest(n2), _digest(out), safethread.get_ident())
        self._cache[ key ] = cached

with::

    try:
        key = (_digest(n1), _digest(n2), _digest(out), safethread.get_ident())
    except KeyError:
        pass
    else:
        self._cache[ key ] = cached

in _cache_miss2 and _cache_miss1. If this were done, _sequence could probably be deleted as well.
I'm not very familiar with the numarray code yet, so it's quite possible I'm missing something, but I'm willing to do more digging to fix this if this turns out to not be sufficient.
Regards,
tim
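To illustrate the shape of the proposed change outside numarray, here is a toy sketch. Everything in it is hypothetical: digest and TinyCache are made-up stand-ins, not numarray's _digest or its ufunc cache. The point is only the try/except/else pattern: if an operand cannot be digested, the result is simply not cached instead of the whole operation failing.

```python
def digest(operand):
    """Hypothetical stand-in: return a hashable key part, or raise KeyError."""
    if isinstance(operand, (int, float)):
        return type(operand).__name__
    raise KeyError("digest: force cache miss")


class TinyCache:
    """Hypothetical cache keyed on the digests of two operands."""

    def __init__(self):
        self._cache = {}

    def remember(self, n1, n2, cached):
        try:
            key = (digest(n1), digest(n2))
        except KeyError:
            pass  # operand is not digestible: skip caching, do not fail
        else:
            self._cache[key] = cached
        return cached


cache = TinyCache()
cache.remember(1, 2.0, "int-float handler")     # cached
cache.remember(1, [1, 2], "fallback handler")   # not cached, but no exception
print(cache._cache)
```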
On Thu, 2003-09-18 at 13:53, Tim Hochberg wrote:
I suspect (hope!) that this is just a bug and not something inherent in numarray.
It's an interoperability issue. Please let us know if you find others.
I dug around in ufunc.py a bit and it appears that the bug is shallow and can be fixed simply by replacing::

    if not (_sequence(n1) or _sequence(n2)):
        key = (_digest(n1), _digest(n2), _digest(out), safethread.get_ident())
        self._cache[ key ] = cached

with::

    try:
        key = (_digest(n1), _digest(n2), _digest(out), safethread.get_ident())
    except KeyError:
        pass
    else:
        self._cache[ key ] = cached

in _cache_miss2 and _cache_miss1. If this were done, _sequence could probably be deleted as well.
I'm not very familiar with the numarray code yet, so it's quite possible I'm missing something, but I'm willing to do more digging to fix this if this turns out to not be sufficient.
I ran into the same problem trying to port MA to numarray, and came up with an identical work around.
A fix like this will be part of numarray-0.8.
Todd
Hi Chunlei,
I just realized one other thing that you should probably be aware of. You could write a much faster version of sortrank in pure Python by doing your sorts differently. Python's built-in sort is very fast, but as soon as you start passing in comparison functions it slows down dramatically. The trick is to arrange the data you need to sort so that you don't need an auxiliary function (known as Decorate-Sort-Undecorate or the Schwartzian transform). Thus, the following is almost certainly a lot faster than your original sortrank, although probably still slower than the argsort solution.

    def sortrank(list):
        index = range(len(list))
        li_a = zip(list, index)
        li_a.sort()
        li_b = [(li_a[i][1], i) for i in index]
        li_b.sort()
        return [x[1] for x in li_b]

Regards,
tim
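For readers on a current interpreter: the function above relies on zip() and range() returning lists, which I believe is no longer the case in Python 3. A minimal sketch of the same Decorate-Sort-Undecorate idea, assuming Python 3 and using sorted() in place of the in-place list sort (the parameter is renamed to values here only to avoid shadowing the built-in list):

```python
def sortrank(values):
    """Rank of each element, using plain tuple sorts and no comparison function."""
    index = range(len(values))
    # Decorate: pair each value with its original position, then sort by value.
    li_a = sorted(zip(values, index))
    # Pair each original position with its sorted position, then sort by original position.
    li_b = sorted((li_a[i][1], i) for i in index)
    # Undecorate: keep only the sorted positions, now in original order.
    return [rank for _, rank in li_b]


print(sortrank([5, 2, 3]))
```

Equal values are ranked in order of appearance, because the original index acts as the tie-breaker in the first sort.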
Thanks again, Tim. It's a wonderful example of how efficiently Python can run when it is well written.
Best,
Chunlei
participants (3)

CL WU

Tim Hochberg

Todd Miller