[Python-Dev] timsort for jython

Fri, 02 Aug 2002 20:51:22 -0400

[Tim]
> Stability doesn't come free, and for all I know, in another 3 years a
> method will be discovered that's 3x faster but not stable.

[Aahz]
> You're pulling our legs, right?  I thought you said this version of
> mergesort was converging on the theoretical lower bound.

For # of comparisons done on randomly ordered data, yes, there's a hard
lower bound of lg(n!) comparisons, but the samplesort hybrid was close
enough to that too that there wouldn't have been much point to timsort.  For
various kinds of partially ordered data, the only catch-all hard lower bound
is n-1 comparisons (read timsort.txt attached to the patch on SF, or
Objects/listsort.txt in current CVS -- there's much more info in those than
in the text file I posted to Python-Dev).

Comparisons aren't the whole story, either, as ~sort showed dramatically in
the x-platform timings (see the patch).  I believe timsort is sometimes more
cache-friendly than the samplesort hybrid (&, e.g., I see no other way to
explain the wild ~sort x-platform behavior), but it's not doing anything
heroic for cache-friendliness.  The pending-run stack invariants
automatically implement what's called "tiling" in the literature, but that's
not the only cache trick it *could* play.

i'll-be-dead-before-the-sorting-story-ly y'rs  - tim