Guido rethinking removal of cmp from sort method
Steven D'Aprano
steve+comp.lang.python at pearwood.info
Mon Mar 28 11:43:03 EDT 2011
On Mon, 28 Mar 2011 14:39:04 +0200, Antoon Pardon wrote:
> I tried to sort lists of 10000 elemens. Each element a tuple two items
> that was to be sorted first according to the first item in ascending
> order, then according to the second item in descending order. This was
> done on python 2.6 so I had to write my own cmp_to_key function.
>
> I then sorted these lists of 10000 elements a 1000 times with the
> following methods.
Thank you for spending the time to get some hard data, but I can't
replicate your results since you haven't shown your code. Rather than
attempt to guess what you did and duplicate it, I instead came up with my
own timing measurements. Results are shown here, my code follows below:
[steve at sylar ~]$ python2.7 sort_test.py
Comparison function: 12.1256039143
Key function: 3.51603388786
Double sort: 2.33165812492
cmp_to_key: 28.1129128933
By far the best result comes from two calls to sort. Not far off is the
key function solution. (Admittedly, coming up with a key function for non-
numeric data would be challenging.) The comparison function, and the
cmp_to_key solution, are clearly *much* slower.
I may do another test with larger lists, and leave it running overnight
and see what happens.
[code]
from random import random, shuffle
from operator import itemgetter
from timeit import Timer
from functools import cmp_to_key
def make_data(n):
x = range(n)*2
y = range(n)*2
shuffle(x)
shuffle(y)
data = zip(x, y)
assert len(data) == 2*n
return data
# Sort according to the first item in ascending order, then by second item
# in descending order:
def cmp_func(t1, t2):
return cmp(t1[0], t2[0]) or cmp(t2[1], t1[1])
def key_func(t):
return (t[0], -t[1])
def sorter(d):
d = sorted(d, key=itemgetter(1), reverse=True)
return sorted(d, key=itemgetter(0))
# Check that all sort methods give the same result.
data = make_data(20)
assert sorted(data, cmp=cmp_func) == sorted(data, key=key_func) == \
sorter(data) == sorted(data, key=cmp_to_key(cmp_func))
data = make_data(5000)
assert len(data) == 10000
setup = "from __main__ import cmp_to_key, cmp_func, key_func, sorter,
data"
t1 = Timer("sorted(data, cmp=cmp_func)", setup)
t2 = Timer("sorted(data, key=key_func)", setup)
t3 = Timer("sorter(data)", setup)
t4 = Timer("sorted(data, key=cmp_to_key(cmp_func))", setup)
print "Comparison function:",
print min(t1.repeat(number=100, repeat=7))
print "Key function:",
print min(t2.repeat(number=100, repeat=7))
print "Double sort:",
print min(t3.repeat(number=100, repeat=7))
print "cmp_to_key:",
print min(t4.repeat(number=100, repeat=7))
[end code]
--
Steven
More information about the Python-list
mailing list