Q: sort's key and cmp parameters

Steven D'Aprano steven at REMOVE.THIS.cybersource.com.au
Tue Oct 6 19:41:14 EDT 2009


On Tue, 06 Oct 2009 12:28:00 -0700, Raymond Hettinger wrote:

> [bearophile]
>> I love the 'key', it makes my code simpler and it's simpler to
>> understand. I am not opposed to the idea of keeping cmp, that in some
>> rare cases may be useful or more natural.
>>
>> The problem is that if you allow to use the cmp, lot of programmers
>> will use it straight away, not even bothering to know what that strange
>> 'key' argument may be useful for. And they miss the how much handy
>> 'key' is.
> ...
>> So a funny solution to this situation may be the following: to keep
>> only 'key' in the sort/sorted for few years, so the community of Python
>> programmers gets used to 'key' only, and then put 'cmp' back in, for
>> the special cases ;-)
> 
> FWIW, sorting is only a small part of the picture.  The notion of
> three-way cmp functions was removed throughout the language.  The
> built-in cmp() function and the __cmp__ magic method and the
> corresponding C slot were all excised.  None of them played nicely with
> rich comparisons.  It was a case where having two-ways-to-do-it was
> complicating everyone's lives.  AFAICT, very few people understood
> exactly how the two interacted -- the under-the-hood C code for it was
> complex, unattractive, and hard to understand.

There's no reason why a hypothetical comparison function for sort() needs 
to be a three-way comparison function. All you need to implement sorting 
is a function to implement less-than -- I believe that sort() and min() 
are already implemented in such a way as to only require the objects 
implements __lt__.

Hypothetically, sort(), min() and heapq.nsmallest() could take an 
optional less-than comparison function, and max() and heapq.nlargest() 
could use a greater-than comparison function.

Of course you can do this now with a cost of O(N) and a helper class:

class SortHelper:
    def __init__(self, x):
        self.x = x
    def __lt__(self, other):
        return self.x < other.x + 42  # or whatever...

# Decorate, Sort, Undecorate.
mylist = map(SortHelper, mylist)
mylist.sort()
mylist = [obj.x for obj in mylist]


If your list is huge, you can do the decoration in place:

for i, x in enumerate(mylist):
    mylist[i] = SortHelper(x)
mylist.sort()
for i, obj in enumerate(mylist):
    mylist[i] = obj.x



but since this is going to especially useful for really big lists, paying 
that extra O(N) cost twice will hurt.



-- 
Steven



More information about the Python-list mailing list