[Python-Dev] Python3 regret about deleting list.sort(cmp=...)

Peter Otten __peter__ at web.de
Sat Mar 12 23:43:24 CET 2011


Guido van Rossum wrote:

> I was just reminded that in Python 3, list.sort() and sorted() no
> longer support the cmp (comparator) function argument. The reason is
> that the key function argument is always better. But now I have a
> nagging doubt about this:
> 
> I recently advised a Googler who was sorting a large dataset and
> running out of memory. My analysis of the situation was that he was
> sorting a huge list of short lines of the form "shortstring,integer"
> with a key function that returned a tuple of the form ("shortstring",
> integer). Using the key function argument, in addition to N short
> string objects, this creates N tuples of length 2, N more slightly
> shorter string objects, and N integer objects. (Not to count a
> parallel array of N more pointers.) Given the object overhead, this
> dramatically increased the memory usage. It so happens that in this
> particular Googler's situation, memory is constrained but CPU time is
> not, and it would be better to parse the strings over and over again
> in a comparator function.
> 
> But in Python 3 this solution is no longer available. How bad is that?
> I'm not sure. But I'd like to at least get the issue out in the open.

While there are other arguments for reintroducing cmp (or perhaps less_than
instead?), the memory problem could also be addressed with a dont_cache_keys
flag or a max_cache_keys limit.
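
For illustration, here is a minimal sketch of the trade-off described in the
quoted message, assuming input lines of the form "shortstring,integer". The
names parse() and compare_lines() and the sample data are made up for this
example. dont_cache_keys and max_cache_keys are only proposals and are not
accepted by list.sort() or sorted(), so the sketch approximates "don't cache
the keys" with functools.cmp_to_key (available in 2.7 and 3.2), which
re-parses both lines on every comparison.

```python
import functools


def parse(line):
    """Split a 'shortstring,integer' line into a (str, int) tuple."""
    name, _, number = line.rpartition(",")
    return name, int(number)


def compare_lines(a, b):
    """Old-style comparator: re-parse both lines on every call.

    Returns a negative, zero, or positive number, like cmp() did.
    """
    ka, kb = parse(a), parse(b)
    return (ka > kb) - (ka < kb)


# Hypothetical sample data; the real dataset was a huge list of such lines.
lines = ["pear,10", "apple,7", "pear,9", "apple,30"]

# Key-based sort: parse() runs once per line, and all N result tuples
# (plus their strings and integers) stay alive for the duration of the sort.
by_key = sorted(lines, key=parse)

# Comparator-based sort: nothing parsed is kept between comparisons,
# at the cost of parsing each line many times.
by_cmp = sorted(lines, key=functools.cmp_to_key(compare_lines))

print(by_key)
print(by_cmp)
```

Both calls yield the same order. Note that cmp_to_key still allocates one
small wrapper object per element, so it reduces the extra memory rather than
eliminating it; a built-in flag could presumably avoid even that.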

Peter


