Feature query: fetch top/bottom k from array
Morning,

My apologies if this deviates from the vision of numpy: I often find myself needing the indices and/or values of the top (or bottom) k items in a numpy array. I am aware of solutions involving partition/argpartition, but these are inelegant. I am thinking of 1-dimensional arrays, but the concept extends to an arbitrary number of dimensions.

Is this a feature that would benefit the numpy package? I am happy to code it up.

Thanks for your time!

Best regards,
Joe
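For readers unfamiliar with the partition/argpartition approach mentioned above, here is a minimal sketch of what it might look like for the flattened (1-dimensional) case. The helper name `top_k_1d` and its `largest` parameter are hypothetical and are not part of NumPy; only `np.argpartition`, `np.argsort`, `np.sort` and `np.unravel_index` are existing calls. Which of several tied values gets selected at the cut-off is left to `np.argpartition` and is therefore arbitrary.

```python
import numpy as np


def top_k_1d(values, k, largest=True):
    """Hypothetical helper (not a NumPy function).

    Returns (top_values, flat_indices) for the k largest (or smallest)
    entries of the flattened input, ordered from best to worst.
    """
    flat = np.asarray(values).ravel()
    n = flat.size
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and the number of elements")

    if largest:
        # The k largest entries end up in the last k slots of the partition.
        cand = np.argpartition(flat, n - k)[n - k:]
        # Descending index order + stable sort + reversal means that equal
        # values come out in order of first appearance.
        cand = np.sort(cand)[::-1]
        order = np.argsort(flat[cand], kind="stable")[::-1]
    else:
        # The k smallest entries end up in the first k slots of the partition.
        cand = np.sort(np.argpartition(flat, k - 1)[:k])
        order = np.argsort(flat[cand], kind="stable")

    idx = cand[order]
    return flat[idx], idx


a = np.array([[5, 8, 1, 3, 0],
              [5, 6, 2, 1, 3],
              [1, 4, 9, 1, 3],
              [8, 0, 4, 7, 0]])

top_values, flat_idx = top_k_1d(a, 4)
# Convert flat indices back into (row, column) positions of the 2-D array.
rows, cols = np.unravel_index(flat_idx, a.shape)
```

Only the k selected candidates are sorted, so the cost is dominated by the linear-time partition rather than by a full sort of the array.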
Morning!

I often find myself needing the indices and/or values of the top (or bottom) k items in a numpy array. I am aware of solutions involving *partition*/*argpartition*, but I find these inelegant, and of solutions using *sort*, but these are inefficient. Is this a feature that would benefit the numpy package, or bloat it? I am happy to code it up. Here are some examples:
import numpy as np
a = np.array( [ [5,8,1,3,0], [5,6,2,1,3], [1,4,9,1,3], [8,0,4,7,0] ] )
# PROPOSED FEATURE: return (ordered) top 4 values in array:
a.top_k(k=4)
array([9, 8, 8, 7])
# CURRENT METHOD: return (ordered) top 4 values in array:
np.sort( np.partition(a.flatten(), -4)[-4:] )[::-1] # faster method
array([9, 8, 8, 7])
np.sort(a.flatten())[::-1][:4] # slower method
array([9, 8, 8, 7])
# PROPOSED FEATURE: return INDICES of (ordered) top 4 values in array:
a.top_k(k=4, return_indices=True)
array([12,1,15,18])
# CURRENT METHOD: return INDICES of (ordered) top 4 values in array:
(-a.flatten()).argsort()[:4]
array([12,1,15,18])
# PROPOSED FEATURE: multidimensional examples:
a.top_k(k=3, axis=0)
array([[8, 5, 5], [8, 6, 4], [9, 4, 2], [7, 3, 1], [3, 3, 0]])
a.top_k(k=3, axis=1)
array([[8, 5, 3], [6, 5, 3], [9, 4, 3], [8, 7, 4]])

I'd also consider including functionality for the bottom k values, and methods for returning indices in the case of tied values (e.g. "first appearance", "random", etc.).

Cheers,
Joe

On Tue, 22 Feb 2022 at 15:30, Joseph Fox-Rabinovitz <jfoxrabinovitz@gmail.com> wrote:
On Tue, 22 Feb 2022 at 14:25, Joseph Bolton <joseph.jazz.bolton@gmail.com> wrote:
> I find myself often requiring the indices and/or values of the top (or bottom) k items in a numpy array.
There was a discussion about this last year:
https://mail.python.org/archives/list/numpy-discussion@python.org/thread/F4P...

That thread mentions the following pull request, which has some more discussion:
https://github.com/numpy/numpy/pull/19117

Friedrich
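Independently of how that discussion resolves, the axis-wise behaviour proposed earlier in this thread can be approximated with existing NumPy calls. The following is only a sketch: `top_k_along_axis` is a hypothetical name, not a NumPy function and not necessarily what the linked pull request implements. It relies on `np.partition`, `np.take`, `np.sort` and `np.flip`, all of which accept an `axis` argument.

```python
import numpy as np


def top_k_along_axis(values, k, axis=-1):
    """Hypothetical helper (not a NumPy function).

    Returns the k largest entries along `axis`, sorted in descending order.
    """
    arr = np.asarray(values)
    n = arr.shape[axis]
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and the length of the axis")

    # After partitioning, the k largest entries along `axis` occupy the
    # last k positions (in no particular order).
    part = np.partition(arr, n - k, axis=axis)
    top = np.take(part, np.arange(n - k, n), axis=axis)

    # Sort only the k survivors, then flip to get descending order.
    return np.flip(np.sort(top, axis=axis), axis=axis)


a = np.array([[5, 8, 1, 3, 0],
              [5, 6, 2, 1, 3],
              [1, 4, 9, 1, 3],
              [8, 0, 4, 7, 0]])

per_row = top_k_along_axis(a, 3, axis=1)     # shape (4, 3)
per_column = top_k_along_axis(a, 3, axis=0)  # shape (3, 5)
```

In this sketch the reduced axis stays in place, so `axis=0` yields one column of results per input column (shape `(3, 5)`), which is the transpose of a one-row-per-column listing. A bottom-k variant would partition at `k - 1` and keep the first k entries instead.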
participants (4)
- Brock Mendel
- Friedrich Romstedt
- Joseph Bolton
- Joseph Fox-Rabinovitz