[Python-Dev] Sorting
Tim Peters
tim.one@comcast.net
Sat, 20 Jul 2002 23:26:44 -0400
Quick update. I left off here:
samplesort
 i    2**i  *sort  \sort  /sort  3sort  ~sort  =sort  !sort
15   32768   0.13   0.01   0.01   0.10   0.04   0.01   0.11
16   65536   0.24   0.02   0.02   0.23   0.08   0.02   0.24
17  131072   0.54   0.05   0.04   0.49   0.18   0.04   0.53
18  262144   1.18   0.09   0.09   1.08   0.37   0.09   1.16
19  524288   2.58   0.19   0.18   2.34   0.76   0.17   2.52
20 1048576   5.58   0.37   0.36   5.12   1.54   0.35   5.46
timsort
 i    2**i  *sort  \sort  /sort  3sort  ~sort  =sort  !sort
15   32768   0.16   0.01   0.02   0.05   0.14   0.01   0.02
16   65536   0.24   0.02   0.02   0.06   0.19   0.02   0.04
17  131072   0.55   0.04   0.04   0.13   0.42   0.04   0.09
18  262144   1.19   0.09   0.09   0.25   0.91   0.09   0.18
19  524288   2.60   0.18   0.18   0.46   1.97   0.18   0.37
20 1048576   5.61   0.37   0.35   1.00   4.26   0.35   0.74
With a lot of complication (albeit principled complication), timsort now
looks like
 i    2**i  *sort  \sort  /sort  3sort  ~sort  =sort  !sort
15   32768   0.14   0.01   0.01   0.04   0.10   0.01   0.02
16   65536   0.24   0.02   0.02   0.05   0.17   0.02   0.04
17  131072   0.54   0.05   0.04   0.13   0.38   0.04   0.09
18  262144   1.18   0.09   0.09   0.24   0.81   0.09   0.18
19  524288   2.57   0.18   0.18   0.46   1.77   0.18   0.37
20 1048576   5.55   0.37   0.35   0.99   3.81   0.35   0.74
on the same data (tiny improvements in *sort and 3sort, significant
improvement in ~sort, huge improvements for some patterns that aren't
touched by this test).
For contrast and a sanity check, I also implemented Edelkamp and Stiegeler's
"Next-to-m" refinement of weak heapsort. If you know what heapsort is, this
is weaker <wink>. In the last decade, Dutton had the bright idea that a
heap is stronger than you need for sorting: it's enough if you know only
that a parent node's value dominates the right child's values, and then
ensure that the root node has no left child. That implies the root node has
the maximum value in the (weak) heap. It doesn't matter what's in the left
child for the other nodes, provided only that they're weak heaps too. The
weaker requirements allow faster (but trickier) code for maintaining the
weak-heap invariant as sorting proceeds, and in particular it requires far
fewer element comparisons than a (strong) heapsort. Edelkamp and Stiegeler
complicated this algorithm in several ways to cut the comparisons even more.
I stopped at their first refinement, which does a worst-case number of
comparisons
N*k - 2**k + N - 2*k
where
k = ceiling(logbase2(N))
so that even the worst case is very good. They have other gimmicks to cut
it more (we're close to the theoretical limit here, so don't read too much
into "more"!), but the first refinement proved so far from being promising
that I dropped it:
weakheapsort
 i    2**i  *sort  \sort  /sort  3sort  ~sort  =sort  !sort
15   32768   0.19   0.12   0.11   0.11   0.11   0.11   0.12
16   65536   0.31   0.26   0.23   0.23   0.24   0.23   0.26
17  131072   0.71   0.55   0.49   0.49   0.51   0.48   0.56
18  262144   1.59   1.15   1.03   1.04   1.08   1.02   1.19
19  524288   3.57   2.43   2.18   2.18   2.27   2.14   2.51
20 1048576   8.01   5.08   4.57   4.58   4.77   4.50   5.29
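To see how close "close to the theoretical limit" is, the worst-case bound
above can be checked numerically against the information-theoretic lower
bound of ceiling(log2(N!)) comparisons for any comparison sort (a sketch of
mine -- the function names are made up, only the N*k - 2**k + N - 2*k
formula comes from the paper):

```python
import math

def weakheap_worst_case(n):
    # Edelkamp & Stiegeler's first refinement: N*k - 2**k + N - 2*k
    # comparisons in the worst case, with k = ceiling(log2(N)).
    k = math.ceil(math.log2(n))
    return n * k - 2 ** k + n - 2 * k

def info_lower_bound(n):
    # Any comparison sort needs at least ceiling(log2(N!)) comparisons
    # in the worst case; lgamma avoids computing N! exactly.
    return math.ceil(math.lgamma(n + 1) / math.log(2))

for i in range(15, 21):
    n = 2 ** i
    wc, lb = weakheap_worst_case(n), info_lower_bound(n)
    print(f"{n:8d} {wc:9d} {lb:9d} ratio {wc / lb:.3f}")
```

For the sizes in the tables the bound stays within about 11% of the lower
bound, which is why squeezing out still more comparisons can't buy much.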
The number of compares isn't the problem with this. The problem appears to
be heapsort's poor cache behavior, leaping around via multiplying and
dividing indices by 2. This is exacerbated in weak heapsort because it also
requires allocating a bit vector, to attach a "which of my children should I
think of as being 'the right child'?" flag to each element, and that also
gets accessed in the same kinds of cache-hostile ways at the same time.
The samplesort and mergesort variants access memory sequentially.
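For concreteness, here's a minimal pure-Python sketch of the base weak
heapsort (Dutton's algorithm, without the Edelkamp/Stiegeler refinements;
my implementation was C, and the function name here is mine). The r vector
is the "which of my children is 'the right child'?" bit vector described
above, and you can see the index doubling/halving that makes the access
pattern cache-hostile:

```python
def weakheapsort(a):
    """Sort list a in place, ascending, via Dutton's weak heapsort."""
    n = len(a)
    if n < 2:
        return a
    r = [0] * n  # reverse bits: children of i are 2i+r[i] and 2i+1-r[i]

    def d_ancestor(j):
        # Climb past "left child" edges; the first "right child" edge
        # leads to j's distinguished ancestor, which must dominate it.
        while (j & 1) == r[j >> 1]:
            j >>= 1
        return j >> 1

    def join(i, j):
        # Enforce a[i] >= a[j]; on a swap, flip j's reverse bit so its
        # subtrees trade left/right roles and both stay weak heaps.
        if a[i] < a[j]:
            a[i], a[j] = a[j], a[i]
            r[j] ^= 1

    # Build the weak heap: compare each node with its distinguished
    # ancestor, from the bottom up.
    for j in range(n - 1, 0, -1):
        join(d_ancestor(j), j)

    # Repeatedly move the max (a[0]) to the end, then re-merge the
    # remaining forest along the leftmost path back up to the root.
    for i in range(n - 1, 1, -1):
        a[0], a[i] = a[i], a[0]
        x = 1
        while 2 * x + r[x] < i:  # descend along left children
            x = 2 * x + r[x]
        while x > 0:
            join(0, x)
            x >>= 1
    a[0], a[1] = a[1], a[0]
    return a
```

Note that every comparison happens inside join, and both the descent and
the d_ancestor climb hop around the array by multiplying/dividing indices
by 2 while also poking at the separate r vector.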
What I haven't accounted for is why weakheapsort appears to get a major
benefit from *any* kind of regularity in the input -- *sort is always the
worst case on each line, and by far (note that this implementation does no
special-casing of any kind, so it must be an emergent property of the core
algorithm). If I were a researcher, I bet I could get a good paper out of
that <wink>.