[Python-Dev] Sorting

Tim Peters tim.one@comcast.net
Mon, 22 Jul 2002 02:05:14 -0400


One more piece of this puzzle.  It's possible that one of {samplesort,
timsort} would become unboundedly faster as the cost of comparisons
increased over that of Python floats (which all the timings I posted used).
Here's a program that would show this if so, using my local Python, where
lists have an .msort() method:

"""
class SlowCmp(object):
    __slots__ = ['val']

    def __init__(self, val):
        self.val = val

    def __lt__(self, other):
        for i in range(SLOW):
            i*i
        return self.val < other.val

def drive(n):
    from random import randrange
    from time import clock as now

    n10 = n * 10
    L = [SlowCmp(randrange(n10)) for i in xrange(n)]
    L2 = L[:]

    t1 = now()
    L.sort()
    t2 = now()
    L2.msort()
    t3 = now()
    return t2-t1, t3-t2

for SLOW in 1, 2, 4, 8, 16, 32, 64, 128:
    print "At SLOW value", SLOW
    for n in range(1000, 10001, 1000):
        ss, ms = drive(n)
        print "    %6d  %6.2f  %6.2f  %6.2f" % (
            n, ss, ms, 100.0*(ss - ms)/ms)
"""

Here's the tail end of the output, from which I conclude that the number pf
comparisons done on random inputs is virtually identical for the two
methods; times vary by a fraction of a percent both ways, with no apparent
pattern (note that time.clock() has better than microsecond resolution on
WIndows, so the times going into the % calculation have more digits than are
displayed here):

At SLOW value 32
      1000    0.22    0.22   -0.05
      2000    0.50    0.50    0.10
      3000    0.80    0.80   -0.64
      4000    1.11    1.10    0.71
      5000    1.44    1.45   -0.12
      6000    1.77    1.76    0.72
      7000    2.10    2.09    0.31
      8000    2.43    2.41    0.79
      9000    2.78    2.80   -0.58
     10000    3.13    3.13   -0.01
At SLOW value 64
      1000    0.37    0.38   -1.00
      2000    0.83    0.83    0.20
      3000    1.33    1.33   -0.15
      4000    1.84    1.84    0.05
      5000    2.40    2.39    0.38
      6000    2.95    2.92    0.97
      7000    3.46    3.47   -0.20
      8000    4.04    4.01    0.87
      9000    4.60    4.63   -0.68
     10000    5.19    5.21   -0.33
At SLOW value 128
      1000    0.68    0.67    0.37
      2000    1.52    1.50    0.99
      3000    2.40    2.41   -0.67
      4000    3.35    3.32    1.03
      5000    4.30    4.32   -0.47
      6000    5.32    5.29    0.54
      7000    6.27    6.27    0.04
      8000    7.29    7.25    0.55
      9000    8.37    8.37   -0.03
     10000    9.39    9.43   -0.49