[issue21592] Make statistics.median run in linear time

Dong-hee Na report at bugs.python.org
Mon May 7 09:25:03 EDT 2018


Dong-hee Na <donghee.na92 at gmail.com> added the comment:

For CPython, I agree with the opinion that finding a median value with Tim sort(C version) is faster than quickselect algorithm with pure python.
Or we need to implement quickselect by C API.

The advantage of implementing this method by quickselect with pure python will be for pypy or any other python compatible compiler.

Considering the role that CPython provides for the standard library to other compatible compilers, it is worth considering from a performance point of view with a compatible compiler. However, for CPython, we have to take a performance hit.

Here is the benchmark result with pypy3 on MacOS

N        sort     select7  select23 select47 select97 select   select2  PR6715
-------- -------- -------- -------- -------- -------- -------- -------  -------
    5000    0.000    0.003    0.000    0.000    0.000    0.001    0.001    0.001
   10000    0.000    0.001    0.001    0.001    0.001    0.000    0.000    0.001
   50000    0.002    0.005    0.004    0.002    0.002    0.000    0.000    0.000
  100000    0.005    0.008    0.004    0.004    0.005    0.001    0.001    0.001
  500000    0.027    0.021    0.016    0.019    0.021    0.004    0.003    0.003
 1000000    0.054    0.035    0.030    0.037    0.041    0.006    0.007    0.008
 2000000    0.104    0.069    0.063    0.074    0.084    0.013    0.013    0.015
 3000000    0.162    0.105    0.091    0.109    0.122    0.017    0.018    0.017
 4000000    0.219    0.137    0.122    0.148    0.167    0.024    0.016    0.024
 5000000    0.275    0.170    0.156    0.182    0.204    0.030    0.028    0.017
 6000000    0.336    0.210    0.181    0.220    0.247    0.030    0.039    0.040
 7000000    0.447    0.283    0.242    0.292    0.329    0.056    0.055    0.043
 8000000    0.503    0.309    0.248    0.313    0.338    0.051    0.049    0.052
 9000000    0.592    0.359    0.323    0.353    0.421    0.065    0.047    0.082
10000000    0.643    0.398    0.357    0.412    0.433    0.061    0.070    0.047
11000000    0.700    0.387    0.365    0.454    0.459    0.087    0.075    0.077

----------
nosy: +corona10

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue21592>
_______________________________________


More information about the Python-bugs-list mailing list