[Python-Dev] Python3 regret about deleting list.sort(cmp=...)
Guido van Rossum
guido at python.org
Sat Mar 12 21:44:46 CET 2011
I was just reminded that in Python 3, list.sort() and sorted() no
longer support the cmp (comparator) function argument. The reason is
that the key function argument is always better. But now I have a
nagging doubt about this:
I recently advised a Googler who was sorting a large dataset and
running out of memory. My analysis of the situation was that he was
sorting a huge list of short lines of the form "shortstring,integer"
with a key function that returned a tuple of the form ("shortstring",
integer). Using the key function argument, in addition to N short
string objects, this creates N tuples of length 2, N more slightly
shorter string objects, and N integer objects. (Not to count a
parallel array of N more pointers.) Given the object overhead, this
dramatically increased the memory usage. It so happens that in this
particular Googler's situation, memory is constrained but CPU time is
not, and it would be better to parse the strings over and over again
in a comparator function.
But in Python 3 this solution is no longer available. How bad is that?
I'm not sure. But I'd like to at least get the issue out in the open.
--Guido van Rossum (python.org/~guido)
More information about the Python-Dev