Sorting (was RE: A better self)

Tim Peters tim.one at comcast.net
Mon Aug 5 01:07:59 CEST 2002


[Terry Reedy, mines <http://www.python.org/sf/587076>]
> The attachments include timsort.txt.  Here is a quick summary:

Very nice!

> 1. View the sequence of N items to be sorted as, and convert it to, a
> sequence of K ascending runs (down runs are reversed).  To avoid the
> overhead of merging small runs, use binary insertsort to bring run
> lengths up to a minimum (depending on N) in the range [32,64].
> Consequence: a list sorted in ascending order or descending order
> without duplicates will be detected as one run and left as is or
> simply reversed.
>
> 2. For K >= 2, timsort(runs) = timmerge(timsort(runs[:j]),
> timesort(runs[j:])).  Note: the merge tree is constructed from the
> bottom up rather than from the top down as might be implied by the
> above description.

It's worth noting that there's *some* j for which that's true, but you can't
know which j until the algorithm ends.  Merging alternates with finding "the
next" run, to get benefit from that the run just found is still high in the
memory hierarchy.  It may or may not get merged into one or more preceding
runs immediately, depending on the unknowable-in-advance natural run
lengths, and that leaves j a mystery until the last run has been found.

> 3. Merge intelligently (including 'galloping') to minimize compares
> and moves and exploit nonrandom structure in the original list.

When you put it that way, I'm surprised the implementation took more than 30
lines <wink>.





More information about the Python-list mailing list