[Python-Dev] Re: Re: lists v. tuples

Tim Peters tim.one@comcast.net
Sun, 16 Mar 2003 17:32:15 -0500


[Guido]
> ...
> I wonder, what's the need for cmp()?  My hunch is that the main reason
> for cmp() is that it's specified in various APIs -- e.g. list.sort(),
> or FP hardware.  But don't those APIs usually specify cmp() because
> their designers (mistakenly) believed the three different outcomes
> were easy to compute together and it would simplify the API?

The three possible outcomes from lexicographic comparison are natural to
compute together, though (compare elements left to right until hitting the
first non-equal element compare).  I expect C's designers had string
comparison mostly in mind, and it's natural for lots of search structures to
need know which of the three outcomes obtains.  For example, probing a
vanilla binary search tree needs to stop when it hits a node with key equal
to the thing searched for, or move left or right when != obtains.

Long int comparison is a variant of lexicographic comparison, and this
problem shows up repeatedly in a multitude of guises:  you have postive long
ints x and y, and want to find the long int q closest to x/y.

    q, r = divmod(x, y)
    # round nearest/even
    if 2*r > q or (q & 1 and 2*r == q):
        q += 1

is more expensive than necessary when the "q & 1 and 2*r == q" part holds:
the "2*r > q" part had to compare 2*r to q all the way to the end to deduce
that > wasn't the answer, and then you do it all over again to deduce that
equality is the right answer.

    q, r = divmod(x, y)
    c = cmp(2*r, q)
    if c > 0 or (q & 1 and c == 0):
        q += 1

is faster, and Python's long_compare() doesn't do any more work than is
really needed by this algorithm.

So sometimes cmp() is exactly what you want.  OTOH, if Python never had it,
the efficiency gains in such cases probably aren't common enough that a
compelling case for adding it could have been made.