efficient idiomatic queue?

David Eppstein eppstein at ics.uci.edu
Thu Jan 24 01:22:56 CET 2002

In article <83it9s4rm2.fsf at panacea.canonical.org>,
 Kragen Sitaker <kragen at pobox.com> wrote:

> Even with hacks to make it less likely to hit its worst-case, though,
> quicksort's constant factor makes it about twice as fast as mergesort.

This is either using clever tricks to pick a good pivot, or assuming that 
comparisons are fast (probably not a good assumption in Python), right? 
Merge sort uses very close to the optimal number of comparisons, single 
randomly-chosen pivot quicksort is off by some constant factor.

It also depends on how your data was represented; several years ago when I 
needed code (in C++) to sort linked-list data I chose merge sort, since 
it's simple, for that representation it does not use any extra memory, and 
the lists I was sorting were small enough that I didn't have to worry about 
caching effects (as I think someone pointed out earlier, array-based 
versions of quick sort and merge sort can be much better than heap sort in 
their caching behavior).

> How does heapsort's constant factor compare with mergesort's?

Not as good, but it uses less memory.  The problem is, quicksort doesn't 
use much more memory than heapsort, and is I think usually much faster, so 
heapsort is not so often the best choice.

> Python's built-in sort() uses quicksort (using a linear congruential
> generator to pick pivot elements) to partition the array into small
> chunks, and then uses an algorithm that looks a lot like a bubble sort
> to sort the small chunks.  It has special cases for already-sorted
> data, all-the-same data, reverse-sorted data, and each of the above
> followed by some unsorted data.  This is the kind of thing you get
> when you aim for practicality.

Yes, theory and practice are not the same thing.

Sometimes (as here) the theoretical algorithms are nice and clean, but 
making them practical creates a mess.  Other times theoreticians use lots 
of complicated tricks to reduce the worst-case complexity while the better 
practical solutions are simpler...
David Eppstein       UC Irvine Dept. of Information & Computer Science
eppstein at ics.uci.edu http://www.ics.uci.edu/~eppstein/

More information about the Python-list mailing list