[Python-Dev] Sorting
Tim Peters
tim.one@comcast.net
Mon, 22 Jul 2002 02:05:14 -0400
One more piece of this puzzle. It's possible that one of {samplesort,
timsort} would become unboundedly faster than the other as the cost of
comparisons grew beyond that of comparing Python floats (which is what all
the timings I posted used). Here's a program that would show this if so,
using my local Python build, where lists have an .msort() method:
"""
class SlowCmp(object):
__slots__ = ['val']
def __init__(self, val):
self.val = val
def __lt__(self, other):
for i in range(SLOW):
i*i
return self.val < other.val
def drive(n):
from random import randrange
from time import clock as now
n10 = n * 10
L = [SlowCmp(randrange(n10)) for i in xrange(n)]
L2 = L[:]
t1 = now()
L.sort()
t2 = now()
L2.msort()
t3 = now()
return t2-t1, t3-t2
for SLOW in 1, 2, 4, 8, 16, 32, 64, 128:
print "At SLOW value", SLOW
for n in range(1000, 10001, 1000):
ss, ms = drive(n)
print " %6d %6.2f %6.2f %6.2f" % (
n, ss, ms, 100.0*(ss - ms)/ms)
"""
Here's the tail end of the output from the timing driver, from which I
conclude that the number of comparisons done on random inputs is virtually
identical for the two methods; times vary by a fraction of a percent both
ways, with no apparent pattern (note that time.clock() has better than
microsecond resolution on Windows, so the times going into the % calculation
have more digits than are displayed here). The columns are n, the sort()
time in seconds, the msort() time in seconds, and the percentage by which
msort won (negative when it lost):
At SLOW value 32
1000 0.22 0.22 -0.05
2000 0.50 0.50 0.10
3000 0.80 0.80 -0.64
4000 1.11 1.10 0.71
5000 1.44 1.45 -0.12
6000 1.77 1.76 0.72
7000 2.10 2.09 0.31
8000 2.43 2.41 0.79
9000 2.78 2.80 -0.58
10000 3.13 3.13 -0.01
At SLOW value 64
1000 0.37 0.38 -1.00
2000 0.83 0.83 0.20
3000 1.33 1.33 -0.15
4000 1.84 1.84 0.05
5000 2.40 2.39 0.38
6000 2.95 2.92 0.97
7000 3.46 3.47 -0.20
8000 4.04 4.01 0.87
9000 4.60 4.63 -0.68
10000 5.19 5.21 -0.33
At SLOW value 128
1000 0.68 0.67 0.37
2000 1.52 1.50 0.99
3000 2.40 2.41 -0.67
4000 3.35 3.32 1.03
5000 4.30 4.32 -0.47
6000 5.32 5.29 0.54
7000 6.27 6.27 0.04
8000 7.29 7.25 0.55
9000 8.37 8.37 -0.03
10000 9.39 9.43 -0.49
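A rough sanity check on the magnitudes (my arithmetic, not measured):
sorting n = 10000 random items takes about n*lg(n), roughly 133,000
comparisons, so at SLOW=128 the ~9.4 second runs work out to something like
70 microseconds per comparison. At that point the comparison cost swamps
everything else the sorts do, which is exactly the regime this test was
meant to probe.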