On Mon, Aug 31, 2009 at 9:06 PM, Sturla Molden
We recently has a discussion regarding an optimization of NumPy's median to average O(n) complexity. After some searching, I found out there is a selection algorithm competitive in speed with Hoare's quick select. It has the advantage of being a lot simpler to implement. In plain Python:
Chad, you can continue to write quick select using NumPy's C quick sort in numpy/core/src/_sortmodule.c.src. When you are done, it might be about 10% faster than this. :-)
I was sick for a bit last week, so got stalled on my version, but I'll be working on it this weekend. I'm going for a more general partition function, that could have slightly more general use cases than just a median. Nevertheless, its good to see there could be several options, hopefully at least one of which can be put into numpy. By the way, as far as I can tell, the above algorithm is exactly the same idea as a non-recursive Hoare (ie. quicksort) selection: Do the partition, then only proceed to the sub-partition that must contain the nth element. My version is a bit more general, allowing partitioning on a range of elements rather than just one, but the concept is the same. The numpy quicksort already does non recursive sorting. I'd also like to, if possible, have a specialized 2D version, since image media filtering is one of my interests, and the C version works on 1D (raveled) arrays only. -C