[issue21592] Make statistics.median run in linear time

Thomas Dybdahl Ahle report at bugs.python.org
Sat May 31 11:15:27 CEST 2014


Thomas Dybdahl Ahle added the comment:

I think "minimize expected-case time" is a good goal. If we wanted "minimize worst-case time" we would have to use k-means rather than quickselect.

My trials on random data, where sort arguably has a disadvantage, suggests sorting is about twice as fast for most input sizes. With pypy quick-select is easily 5-10 times faster, which I take as a suggestion that a C-implementation might be worth a try.

For designing a realistic test-suite, I suppose we need to look at what tasks medians are commonly used for. I'm thinking median filters from image processing, medians clustering, robust regressing, anything else?

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue21592>
_______________________________________


More information about the Python-bugs-list mailing list