[Python-Dev] Python3 regret about deleting list.sort(cmp=...)
Terry Reedy
tjreedy at udel.edu
Sun Mar 13 03:17:39 CET 2011
On 3/12/2011 8:28 PM, Steven D'Aprano wrote:
> Fredrik Johansson wrote:
>
>> Consider sorting a list of pairs representing fractions. This can be
>> done easily in Python 2.x with the comparison function lambda
>> (p,q),(r,s): cmp(p*s, q*r). In Python 2.6, this is about 40 times
>> faster than using fractions.Fraction as a key function.
>
>
> [steve at sylar ~]$ python2.7 -m timeit -s "L = [(1,2), (3,4), (0,5),
> (9,100), (3,7), (2,8)]" "sorted(L, lambda (p,q),(r,s): cmp(p*s, q*r))"
> 10000 loops, best of 3: 25.1 usec per loop
>
> [steve at sylar ~]$ python2.7 -m timeit -s "L = [(1,2), (3,4), (0,5),
> (9,100), (3,7), (2,8)]" -s "from fractions import Fraction" "sorted(L,
> key=lambda t: Fraction(*t))"
> 1000 loops, best of 3: 236 usec per loop
>
>
> So for a short list, I get a factor of ten difference. For a longer
> list, I'd expect the key function to win out. Much to my astonishment,
> it doesn't -- I get similar results regardless of the size of L.
>
> Size of L key/cmp
> ========== =========
> 6 9.4
> 600 13.9
> 60000 7.0
> 6000000 6.7
Interesting. Comparing two Fractions must do the same thing as the cmp
function, compare two products, but it does so with a *lot* more overhead:
571 def _richcmp(self, other, op):
572 """Helper for comparison operators, for internal use
574 Implement comparison between a Rational instance and
575 either another Rational instance or a float `other`. If
576 `other` is not a Rational instance or a float, return
577 NotImplemented. `op` should be one of the six standard
578 comparison operators.
580 """
581 # convert other to a Rational instance where reasonable.
582 if isinstance(other, numbers.Rational):
583 return op(self._numerator * other.denominator,
584 self._denominator * other.numerator)
585 if isinstance(other, float):
586 if math.isnan(other) or math.isinf(other):
587 return op(0.0, other)
588 else:
589 return op(self, self.from_float(other))
590 else:
591 return NotImplemented
592
593 def __lt__(a, b):
594 """a < b"""
595 return a._richcmp(b, operator.lt)
For this example, and any with suitably restricted values, key=float
would work and should beat the cmp version. But of course, someone will
have a use case for which that will not work. (But then, one could do a
near linear scan to properly re-sort slices with equal float keys.)
Hmm. As I remember, one rationale for deleting cmp is that nlogn slow
cmp() calls are slower than n slow key() calls, but that only matters if
the nlogn '<' comparisions of stored keys are fast compared to cmp calls
(as tends to be the case when the keys are builtin class instances).
But in this case, they are much slower. To be faster, one would need
something like "key=lambda p,q:p*(lcm//q)", where lcm is the least
common multiple of of all the q's (denominators). For the example above,
lcm = 700. But for longer lists, it could be outrageously large.
--
Terry Jan Reedy
More information about the Python-Dev
mailing list