[Numpy-discussion] argsort speed

Francesc Alted francesc at continuum.io
Mon Feb 17 09:18:20 EST 2014


On 2/17/14, 1:08 AM, josef.pktd at gmail.com wrote:
> On Sun, Feb 16, 2014 at 6:12 PM, Daπid <davidmenhur at gmail.com> wrote:
>> On 16 February 2014 23:43, <josef.pktd at gmail.com> wrote:
>>> What's the fastest argsort for a 1d array with around 28 Million
>>> elements, roughly uniformly distributed, random order?
>>
>> On numpy latest version:
>>
>> for kind in ['quicksort', 'mergesort', 'heapsort']:
>>      print kind
>>      %timeit np.sort(data, kind=kind)
>>      %timeit np.argsort(data, kind=kind)
>>
>>
>> quicksort
>> 1 loops, best of 3: 3.55 s per loop
>> 1 loops, best of 3: 10.3 s per loop
>> mergesort
>> 1 loops, best of 3: 4.84 s per loop
>> 1 loops, best of 3: 9.49 s per loop
>> heapsort
>> 1 loops, best of 3: 12.1 s per loop
>> 1 loops, best of 3: 39.3 s per loop
>>
>>
>> It looks quicksort is quicker sorting, but mergesort is marginally faster
>> sorting args. The diference is slim, but upon repetition, it remains
>> significant.
>>
>> Why is that? Probably part of the reason is what Eelco said, and part is
>> that in sort comparison are done accessing the array elements directly, but
>> in argsort you have to index the array, introducing some overhead.
> Thanks, both.
>
> I also gain a second with mergesort.
>
> matlab would be nicer in my case, it returns both.
> I still need to use the argsort to index into the array to also get
> the sorted array.

Many years ago I needed something similar, so I made some functions for 
sorting and argsorting in one single shot.  Maybe you want to reuse 
them.  Here it is an example of the C implementation:

https://github.com/PyTables/PyTables/blob/develop/src/idx-opt.c#L619

and here the Cython wrapper for all of them:

https://github.com/PyTables/PyTables/blob/develop/tables/indexesextension.pyx#L129

Francesc

>
> Josef
>
>
>> I seem unable to find the code for ndarray.sort, so I can't check. I have
>> tried to grep it tring all possible combinations of "def ndarray",
>> "self.sort", etc. Where is it?
>>
>>
>> /David.
>>
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion


-- 
Francesc Alted




More information about the NumPy-Discussion mailing list