[Numpy-discussion] argsort speed

Ondřej Čertík ondrej.certik at gmail.com
Mon Feb 24 15:13:01 EST 2014


On Fri, Feb 21, 2014 at 11:09 PM, Charles R Harris
<charlesr.harris at gmail.com> wrote:
>
>
>
> On Fri, Feb 21, 2014 at 10:35 PM, Ondřej Čertík <ondrej.certik at gmail.com>
> wrote:
>>
>> On Mon, Feb 17, 2014 at 11:40 AM, Charles R Harris
>> <charlesr.harris at gmail.com> wrote:
>> >
>> >
>> >
>> > On Mon, Feb 17, 2014 at 11:32 AM, Julian Taylor
>> > <jtaylor.debian at googlemail.com> wrote:
>> >>
>> >> On 17.02.2014 15:18, Francesc Alted wrote:
>> >> > On 2/17/14, 1:08 AM, josef.pktd at gmail.com wrote:
>> >> >> On Sun, Feb 16, 2014 at 6:12 PM, Daπid <davidmenhur at gmail.com>
>> >> >> wrote:
>> >> >>> On 16 February 2014 23:43, <josef.pktd at gmail.com> wrote:
>> >> >>>> What's the fastest argsort for a 1d array with around 28 Million
>> >> >>>> elements, roughly uniformly distributed, random order?
>> >> >>>
>> >> >>> On numpy latest version:
>> >> >>>
>> >> >>> for kind in ['quicksort', 'mergesort', 'heapsort']:
>> >> >>>      print kind
>> >> >>>      %timeit np.sort(data, kind=kind)
>> >> >>>      %timeit np.argsort(data, kind=kind)
>> >> >>>
>> >> >>>
>> >> >>> quicksort
>> >> >>> 1 loops, best of 3: 3.55 s per loop
>> >> >>> 1 loops, best of 3: 10.3 s per loop
>> >> >>> mergesort
>> >> >>> 1 loops, best of 3: 4.84 s per loop
>> >> >>> 1 loops, best of 3: 9.49 s per loop
>> >> >>> heapsort
>> >> >>> 1 loops, best of 3: 12.1 s per loop
>> >> >>> 1 loops, best of 3: 39.3 s per loop
>> >> >>>
>> >> >>>
>> >> >>> It looks quicksort is quicker sorting, but mergesort is marginally
>> >> >>> faster
>> >> >>> sorting args. The diference is slim, but upon repetition, it
>> >> >>> remains
>> >> >>> significant.
>> >> >>>
>> >> >>> Why is that? Probably part of the reason is what Eelco said, and
>> >> >>> part
>> >> >>> is
>> >> >>> that in sort comparison are done accessing the array elements
>> >> >>> directly, but
>> >> >>> in argsort you have to index the array, introducing some overhead.
>> >> >> Thanks, both.
>> >> >>
>> >> >> I also gain a second with mergesort.
>> >> >>
>> >> >> matlab would be nicer in my case, it returns both.
>> >> >> I still need to use the argsort to index into the array to also get
>> >> >> the sorted array.
>> >> >
>> >> > Many years ago I needed something similar, so I made some functions
>> >> > for
>> >> > sorting and argsorting in one single shot.  Maybe you want to reuse
>> >> > them.  Here it is an example of the C implementation:
>> >> >
>> >> > https://github.com/PyTables/PyTables/blob/develop/src/idx-opt.c#L619
>> >> >
>> >> > and here the Cython wrapper for all of them:
>> >> >
>> >> >
>> >> >
>> >> > https://github.com/PyTables/PyTables/blob/develop/tables/indexesextension.pyx#L129
>> >> >
>> >> > Francesc
>> >> >
>> >>
>> >> that doesn't really make a big difference if the data is randomly
>> >> distributed.
>> >> the sorting operation is normally much more expensive than latter
>> >> applying the indices:
>> >>
>> >> In [1]: d = np.arange(10000000)
>> >>
>> >> In [2]: np.random.shuffle(d)
>> >>
>> >> In [3]: %timeit  np.argsort(d)
>> >> 1 loops, best of 3: 1.99 s per loop
>> >>
>> >> In [4]: idx = np.argsort(d)
>> >>
>> >> In [5]: %timeit d[idx]
>> >> 1 loops, best of 3: 213 ms per loop
>> >>
>> >>
>> >>
>> >> But if your data is not random it can make a difference as even
>> >> quicksort can be a lot faster then.
>> >> timsort would be a nice addition to numpy, it performs very well for
>> >> partially sorted data. Unfortunately its quite complicated to
>> >> implement.
>> >
>> >
>> > Quicksort and shellsort gain speed by having simple inner loops. I have
>> > the
>> > impression that timsort is optimal when compares and memory access are
>> > expensive, but I haven't seen any benchmarks for native types in
>> > contiguous
>> > memory.
>>
>> I found some benchmarks for continuous memory here:
>>
>> https://github.com/swenson/sort/
>> https://github.com/gfx/cpp-TimSort
>>
>> The first one seems the best, it probably can be directly reused in numpy.
>> The only issue is that it only sorts the array, but does not provide
>> argsort.
>>
>
> I'm impressed by the heapsort time. Heapsort is the slowest of the numpy
> sorts. So either the heapsort implementation is better than ours or the
> other sort are worse ;)
>
> Partially sorted sequence are pretty common, so timsort might be a worthy
> addition. Last time I looked, JDK was using timsort for sorting objects, and
> quicksort for native types. Another sort is dual pivot quicksort that I've
> heard some good things about.
>
> Adding indirect sorts isn't so difficult once the basic sort is available.
> Since the memory access tends to be larger as it gets randomly accessed,
> timsort might be a good choice for that.

Indeed, I think one has to be very careful about these benchmarks, since
it highly depends on the structure of the arrays being sorted.

I've been looking into this a bit, since I need some fast algorithm in Fortran,
that returns indices that sort the array. So far I use quicksort, but
this Timsort
might perform better for partially sorted arrays, which is typically
my use case.

Ondrej



More information about the NumPy-Discussion mailing list