[Python-Dev] Python3 regret about deleting list.sort(cmp=...)
Terry Reedy
tjreedy at udel.edu
Sat Mar 12 23:09:39 CET 2011
On 3/12/2011 3:44 PM, Guido van Rossum wrote:
> I was just reminded that in Python 3, list.sort() and sorted() no
> longer support the cmp (comparator) function argument. The reason is
> that the key function argument is always better. But now I have a
> nagging doubt about this:
>
> I recently advised a Googler who was sorting a large dataset and
> running out of memory. My analysis of the situation was that he was
> sorting a huge list of short lines of the form "shortstring,integer"
> with a key function that returned a tuple of the form ("shortstring",
> integer).
I believe that if the integer field were padded with leading blanks as
needed so that all are the same length, then no key would be needed.
ll = ['a,11111', 'ab,    3', 'a,    1', 'a,  111']
ll.sort()
print(ll)
>>>
['a,    1', 'a,  111', 'a,11111', 'ab,    3']
If most ints are near the max len, this would add little space, and be
even faster than with the key.
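A minimal sketch of the padding step, assuming each line contains a comma before a non-negative integer field. The helper name pad_line and the way the width is computed are illustrative, not part of the original data pipeline:

```python
def pad_line(line, width):
    """Right-justify the integer field of 'shortstring,integer' to `width` chars."""
    # rpartition splits on the LAST comma, so commas inside the string part survive.
    name, _, number = line.rpartition(',')
    return name + ',' + number.strip().rjust(width)

lines = ['a,11111', 'ab,3', 'a,1', 'a,111']

# Assumption: the widest integer field in the data sets the padding width.
width = max(len(line.rpartition(',')[2].strip()) for line in lines)

padded = [pad_line(line, width) for line in lines]
padded.sort()  # plain string comparison, no key function needed
print(padded)
# ['a,    1', 'a,  111', 'a,11111', 'ab,    3']
```

This only matches the ordering of the (string, integer) tuple key as long as the string part contains no character that sorts below ',' and the integers carry no sign, so it is worth checking against the real data first.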
> Using the key function argument, in addition to N short
> string objects, this creates N tuples of length 2, N more slightly
> shorter string objects, and N integer objects. (Not to count a
> parallel array of N more pointers.) Given the object overhead, this
> dramatically increased the memory usage. It so happens that in this
> particular Googler's situation, memory is constrained but CPU time is
> not, and it would be better to parse the strings over and over again
> in a comparator function.
Was 3.2 used? It includes a patch that reduces the extra memory, and that
patch might not be in the last 3.1 release.
> But in Python 3 this solution is no longer available. How bad is that?
> I'm not sure. But I'd like to at least get the issue out in the open.
This removal has been one of the more contentious issues about (not)
using 3.x. I believe Raymond had been more involved in the defense of
the decision than I. However, the discussion/complaint has mostly been
about the relative difficulty of writing a key function versus a compare
function.
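For comparison, here is a rough sketch of the two styles applied to the "shortstring,integer" lines discussed above. The names parse and compare_lines are made up for illustration. functools.cmp_to_key (available in 3.2) adapts an old-style compare function to the key argument:

```python
from functools import cmp_to_key

def parse(line):
    """Key function: split 'shortstring,integer' into a (str, int) tuple."""
    name, _, number = line.rpartition(',')
    return name, int(number)

def compare_lines(a, b):
    """Old-style compare function: returns a negative, zero, or positive int.

    It re-parses both lines on every comparison instead of storing parsed keys.
    """
    ka, kb = parse(a), parse(b)
    return (ka > kb) - (ka < kb)

lines = ['a,11111', 'ab,3', 'a,1', 'a,111']

by_key = sorted(lines, key=parse)                      # one tuple kept per line
by_cmp = sorted(lines, key=cmp_to_key(compare_lines))  # parses repeatedly

print(by_key)
print(by_cmp)
```

Both calls should produce the same ordering. Note that cmp_to_key still creates one small wrapper object per element, so whether it actually relieves memory pressure in a case like the one above would have to be measured.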
--
Terry Jan Reedy