[Python-Dev] Python3 regret about deleting list.sort(cmp=...)

Terry Reedy tjreedy at udel.edu
Sun Mar 13 03:17:39 CET 2011


On 3/12/2011 8:28 PM, Steven D'Aprano wrote:
> Fredrik Johansson wrote:
>
>> Consider sorting a list of pairs representing fractions. This can be
>> done easily in Python 2.x with the comparison function lambda
>> (p,q),(r,s): cmp(p*s, q*r). In Python 2.6, this is about 40 times
>> faster than using fractions.Fraction as a key function.
>
>
> [steve at sylar ~]$ python2.7 -m timeit -s "L = [(1,2), (3,4), (0,5),
> (9,100), (3,7), (2,8)]" "sorted(L, lambda (p,q),(r,s): cmp(p*s, q*r))"
> 10000 loops, best of 3: 25.1 usec per loop
>
> [steve at sylar ~]$ python2.7 -m timeit -s "L = [(1,2), (3,4), (0,5),
> (9,100), (3,7), (2,8)]" -s "from fractions import Fraction" "sorted(L,
> key=lambda t: Fraction(*t))"
> 1000 loops, best of 3: 236 usec per loop
>
>
> So for a short list, I get a factor of ten difference. For a longer
> list, I'd expect the key function to win out. Much to my astonishment,
> it doesn't -- I get similar results regardless of the size of L.
>
> Size of L key/cmp
> ========== =========
> 6 9.4
> 600 13.9
> 60000 7.0
> 6000000 6.7

Interesting. Comparing two Fractions must do the same thing as the cmp 
function, compare two products, but it does so with a *lot* more overhead:

571     def _richcmp(self, other, op):
572         """Helper for comparison operators, for internal use
574         Implement comparison between a Rational instance and
575         either another Rational instance or a float `other`. If
576         `other` is not a Rational instance or a float, return
577         NotImplemented. `op` should be one of the six standard
578         comparison operators.
580         """
581         # convert other to a Rational instance where reasonable.
582         if isinstance(other, numbers.Rational):
583             return op(self._numerator * other.denominator,
584                       self._denominator * other.numerator)
585         if isinstance(other, float):
586             if math.isnan(other) or math.isinf(other):
587                 return op(0.0, other)
588             else:
589                 return op(self, self.from_float(other))
590         else:
591             return NotImplemented
592
593     def __lt__(a, b):
594         """a < b"""
595         return a._richcmp(b, operator.lt)

For this example, and any with suitably restricted values, key=float 
would work and should beat the cmp version. But of course, someone will 
have a use case for which that will not work. (But then, one could do a 
near linear scan to properly re-sort slices with equal float keys.)

Hmm. As I remember, one rationale for deleting cmp is that nlogn slow 
cmp() calls are slower than n slow key() calls, but that only matters if 
the nlogn '<' comparisions of stored keys are fast compared to cmp calls 
(as tends to be the case when the keys are builtin class instances).

But in this case, they are much slower. To be faster, one would need 
something like "key=lambda p,q:p*(lcm//q)", where lcm is the least 
common multiple of of all the q's (denominators). For the example above, 
lcm = 700. But for longer lists, it could be outrageously large.

-- 
Terry Jan Reedy



More information about the Python-Dev mailing list