[Numpy-discussion] k maximal elements

Keith Goodman kwgoodman at gmail.com
Mon Jun 6 09:44:38 EDT 2011


On Sun, Jun 5, 2011 at 11:15 PM, Alex Ter-Sarkissov <ater1980 at gmail.com> wrote:
> I have a vector of positive integers length n. Is there a simple (i.e.
> without sorting/ranking) of 'pulling out' k larrgest (or smallest) values.
> Something like
>
> sum(x[sum(x,1)>(max(sum(x,1)+min(sum(x,1))))/2,])
>
> but smarter

You could use bottleneck [1]. It has a partial sort written in cython:

    >> a = np.arange(10)
    >> np.random.shuffle(a)
    >> a
       array([6, 3, 8, 5, 4, 2, 0, 1, 9, 7])
    >> import bottleneck as bn
    >> k = 3
    >> b = bn.partsort(a, a.size-k)
    >> b[-k:]
       array([8, 9, 7])

Or you could get the indices:

    >> idx = bn.argpartsort(a, a.size-k)
    >> idx[-k:]
       array([2, 8, 9])

[1] The partial sort will be in Bottleneck 0.5.0. There is a beta2 at
https://github.com/kwgoodman/bottleneck/downloads. Release version
will be posted at http://pypi.python.org/pypi/Bottleneck



More information about the NumPy-Discussion mailing list