
Hi Berthold, I tested your code on Win95 using Numeric 20.0.0 and got essentially the same pattern of times. So, upgrading your version of Numeric is not likely to help. Curious, I checked out the code for sort and found that it justs calls qsort from the C library. I suspect that the problem is related to quicksort having bad worst case behaviour. Not being a computer scientest, I can't tell you under what situations the bad behaviour is triggered, although I know it doesn't like presorted lists. Anyway, if you're using 1D arrays, one workaround would be to use list.sort. Python's sorting routines has, I understand, lots of gimmicks to avoid the problems that quicksort sometimes encounters. I tried it and this: r=(random((71400,))*7).astype(Int) l = r.tolist() l.sort() p = array(l) runs about 40 times faster than this: r=(random((71400,))*7).astype(Int) p = r.sort() If you don't need to convert to and from a array, this approach is 60 times faster. Even if you're dealing with a multidimensional array, this approach (in a loop) might be signifigantly faster assuming you're sorting along the long axis. It makes one wonder if using the python sort rather than qsort for Numeric.sort would be a profitable change. No time to investigate it right now though. Hope that's useful... -tim
We have a speed problem with Numeric.sort on large arrays with only a few different values. Here is my example
-- snip --
cat numtst.py import Numeric print Numeric.__version__ class timer: def __init__(self): import time self.start = time.time() def stop(self): import time print "%.3f" % (time.time() - self.start)
from RandomArray import random from Numeric import sort, Int
r=random((71400,)) t = timer() ; p=sort(r) ; t.stop() r=(random((71400,))*70000).astype(Int) t = timer() ; p=sort(r) ; t.stop() r=(random((71400,))*70).astype(Int) t = timer() ; p=sort(r) ; t.stop() r=(random((71400,))*7).astype(Int) t = timer() ; p=sort(r) ; t.stop() 16:27 hoel@seeve:hoel 2>python numtst.py 17.3.0 0.185 0.148 2.053 21.668
-- snip --
So the less different values are contained in the array the longer takes the sorting. Is this also the case with newer versions of Numeric (But this is Python 1.5.2)? Why is sorting of these arrays so slow?
Thanks
Berthold -- email: hoel@GermanLloyd.org ) tel. : +49 (40) 3 61 49 - 73 74 ( C[_] These opinions might be mine, but never those of my employer.
_______________________________________________ Numpy-discussion mailing list Numpy-discussion@lists.sourceforge.net http://lists.sourceforge.net/lists/listinfo/numpy-discussion