[Python-Dev] Python3 regret about deleting list.sort(cmp=...)

Terry Reedy tjreedy at udel.edu
Sat Mar 12 23:09:39 CET 2011


On 3/12/2011 3:44 PM, Guido van Rossum wrote:
> I was just reminded that in Python 3, list.sort() and sorted() no
> longer support the cmp (comparator) function argument. The reason is
> that the key function argument is always better. But now I have a
> nagging doubt about this:
>
> I recently advised a Googler who was sorting a large dataset and
> running out of memory. My analysis of the situation was that he was
> sorting a huge list of short lines of the form "shortstring,integer"
> with a key function that returned a tuple of the form ("shortstring",
> integer).

I believe that if the integer field were padded with leading blanks as 
needed so that all are the same length, then no key would be needed.

ll = ['a,11111', 'ab,    3', 'a,    1', 'a,  111']
ll.sort()
print(ll)
 >>>
['a,    1', 'a,  111', 'a,11111', 'ab,    3']

If most ints are near the maximum length, this would add little space and 
be even faster than sorting with the key.
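
A minimal sketch of how the padding could be done, assuming each line has 
exactly one comma followed by a non-negative integer. The names pad_line 
and width are hypothetical, chosen for illustration. It also assumes the 
string field contains no character that sorts below the comma (a space, 
for example), since otherwise plain string order and (string, integer) 
order can disagree.

```python
def pad_line(line, width):
    """Right-justify the integer field to `width` characters using blanks."""
    text, _, number = line.rpartition(',')
    return text + ',' + number.strip().rjust(width)

# Hypothetical unpadded input in the "shortstring,integer" form.
raw = ['a,11111', 'ab,3', 'a,1', 'a,111']

# Width of the longest integer field, found in one pass over the data.
width = max(len(line.rpartition(',')[2].strip()) for line in raw)

padded = [pad_line(line, width) for line in raw]
padded.sort()  # plain string comparison, no key function needed
print(padded)
```

The padded strings replace the originals, so the only extra memory is the 
few blanks added per line.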

> Using the key function argument, in addition to N short
> string objects, this creates N tuples of length 2, N more slightly
> shorter string objects, and N integer objects. (Not to count a
> parallel array of N more pointers.) Given the object overhead, this
> dramatically increased the memory usage. It so happens that in this
> particular Googler's situation, memory is constrained but CPU time is
> not, and it would be better to parse the strings over and over again
> in a comparator function.

Was 3.2 used? It includes a patch that reduces the extra memory; that 
patch might not be in the last 3.1 release.

> But in Python 3 this solution is no longer available. How bad is that?
> I'm not sure. But I'd like to at least get the issue out in the open.

This removal has been one of the more contentious issues about (not) 
using 3.x. I believe Raymond had been more involved in the defense of 
the decision than I. However, the discussion/complaint has mostly been 
about the relative difficulty of writing a key function versus a compare 
function.
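
For comparison, here is a sketch of the two styles applied to the 
"shortstring,integer" lines discussed above. The helper names split_line, 
line_key and line_cmp are hypothetical. functools.cmp_to_key (available in 
3.2 and 2.7) adapts a compare function to the key argument; note that it 
still wraps each item in a small object, so it is not free of per-item 
overhead, and I have not measured how it compares with the tuple key.

```python
from functools import cmp_to_key

def split_line(line):
    """Parse 'shortstring,integer' into a (str, int) pair."""
    text, _, number = line.rpartition(',')
    return text, int(number)

def line_key(line):
    # Key style: parse once per item; the tuple is kept for the whole sort.
    return split_line(line)

def line_cmp(a, b):
    # Compare style: parse both lines again on every comparison.
    ka, kb = split_line(a), split_line(b)
    return (ka > kb) - (ka < kb)  # negative, zero, or positive

lines = ['a,11111', 'ab,3', 'a,1', 'a,111']
by_key = sorted(lines, key=line_key)
by_cmp = sorted(lines, key=cmp_to_key(line_cmp))
print(by_key == by_cmp)
```

Both calls give the same order; the difference is whether the parsed 
values are stored for each item or recomputed during each comparison.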

-- 
Terry Jan Reedy


