On Fri, Jan 30, 2015 at 5:58 AM, Julian Taylor < jtaylor.debian@googlemail.com> wrote:
On 01/30/2015 02:57 AM, Neal Becker wrote:
It sounds like np.partition could be used to answer the question: give me the highest K elements in a vector.
Is this a correct interpretation? Something like partial sort, but returned elements are unsorted.
I could really make some use of this, but in my case it is a list of objects I need to sort on a particular key. Is this algorithm available in general python code (not specific to numpy arrays)?
This is a correct interpretation, it is a selection algorithm, (np.select was already taken and I don't like c++ name nth_element). It guarantees all elements after the kth element are larger/equal to the kth and everything before smaller/equal. It can also select multiple orders in one go, e.g. the 10 largest and the 10 smallest (used for e.g. np.percentile) Note you can create a partial sort by running selection first and then sorting only the sections you want sorted. Its not perfectly efficient as it won't share selection pivots with quickselect but its faster than a full sort.
Unfortunately python does not have something similar for objects. There is an issue for it though: http://bugs.python.org/issue21592
As a workaround you can generate a metric ndarray of your objects e.g. by selecting only your keys from the object that can be sorted with less than (ideally into a native numpy type like int/float) and then use argpartition to get the indices that would sort the original object array/list.
As of right now, np.partition ends up calling np.sort unless the dtype is one of the numeric types. But it is smoking fast compared to sorting when the real thing kicks in! Jaime
If that is actually going to be faster than just using pythons full sort would need to be tested. Also be aware of https://github.com/numpy/numpy/issues/5524 jaime has just found likely after when looking at your mail :) _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
-- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes de dominación mundial.