[Numpy-discussion] algorithm for faster median calculation ?

Jerome Caron jerome_caron_astro at ymail.com
Tue Jan 15 14:31:25 EST 2013


Dear all,
I am new to the Numpy-discussion list.
I would like to pass along some possibly useful information about calculating the median.
The message below was posted today on the AstroPy mailing list.
Kind regards
Jerome Caron

#----------------------------------------
I think the calculation of median values in NumPy is not optimal. Does anyone know of other libraries that do better?
On my machine I get these results:
>>> data = numpy.random.rand(5000,5000)
>>> t0=time.time();print numpy.ma.median(data);print time.time()-t0
0.499845739822
15.1949999332
>>> t0=time.time();print numpy.median(data);print time.time()-t0
0.499845739822
4.32100009918
>>> t0=time.time();print aspylib.astro.get_median(data);print time.time()-t0
[ 0.49984574]
0.90499997139
>>>
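
The session above uses Python 2 print statements and wall-clock time.time() calls. As a rough sketch only, the same kind of measurement could be wrapped in a small helper under Python 3. The name timed is hypothetical and not part of NumPy or Aspylib, and time.perf_counter() is used here as an assumption about what clock is appropriate:

```python
import time


def timed(func, *args):
    """Call func(*args) once and return (result, elapsed_seconds)."""
    start = time.perf_counter()  # monotonic, high-resolution clock
    result = func(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed


if __name__ == "__main__":
    # Stand-in workload; any callable (e.g. a median function) can be passed.
    values = [5.0, 1.0, 4.0, 2.0, 3.0]
    result, elapsed = timed(sorted, values)
    print(result, elapsed)
```

Any of the median functions compared above could be passed as func with the array as its argument. A single call is noisy, so repeating the measurement a few times would give a fairer comparison.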

The median calculation in Aspylib uses C code by Nicolas Devillard (available here: http://ndevilla.free.fr/median/index.html), interfaced with ctypes.
It could easily be reused in other, more official packages. I think the code also finds quantiles efficiently.
See: http://www.aspylib.com/
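
To illustrate why a selection routine can beat a full sort for the median, here is a minimal pure-Python sketch of Wirth's k-th smallest selection algorithm. This is an assumption about the general kind of technique involved, not necessarily the routine Aspylib actually calls. The names kth_smallest and median_by_selection are chosen here for illustration, and the input is assumed to contain no NaN values:

```python
def kth_smallest(values, k):
    """Return the k-th smallest element (0-based) without a full sort."""
    a = list(values)  # work on a copy so the caller's data is untouched
    lo, hi = 0, len(a) - 1
    while lo < hi:
        pivot = a[k]
        i, j = lo, hi
        # Partition a[lo..hi] around the pivot value.
        while i <= j:
            while a[i] < pivot:
                i += 1
            while a[j] > pivot:
                j -= 1
            if i <= j:
                a[i], a[j] = a[j], a[i]
                i += 1
                j -= 1
        # Keep only the side that still contains index k.
        if j < k:
            lo = i
        if k < i:
            hi = j
    return a[k]


def median_by_selection(values):
    """Median via selection; averages the two middle values for even length."""
    n = len(values)
    if n == 0:
        raise ValueError("median of an empty sequence")
    upper = kth_smallest(values, n // 2)
    if n % 2 == 1:
        return upper
    lower = kth_smallest(values, n // 2 - 1)
    return (lower + upper) / 2.0
```

Selection only orders the data as far as needed to place element k, which is why it typically does less work than sorting everything. The same kth_smallest call with a different k gives other order statistics, which is the basis for computing quantiles this way.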

