[Python-Dev] Keep default comparisons - or add a second set?

Mon Dec 19 19:51:12 CET 2005

PEP 3000 now suggests that dropping default comparisons has become
more than an idle what-if.

Unfortunately, one very common use case of comparisons is to get a
canonical order.  If the order is sensible, all the better, but that
is not strictly required.  One of Python's selling points (especially
compared to Java) is that getting a canonical order "just works", even
if the objects being sorted are not carefully homogenized by hand. 
Python itself relies on this when comparing same-length dictionaries.

There are times when a specific comparison doesn't make sense (a date
vs. a datetime that falls within it), but these are corner cases best
handled by the specific class that understands the specific
requirements -- classes that already have to override the comparison
operators anyhow.

Even the recently posted "get rid of default comparisons" use case is
really just an argument for making the canonical ordering work better.
The problem Jim Fulton describes is that the (current default)
canonical order will change if objects are saved to a database and
then imported into a different session.  Removing default comparisons
wouldn't really help much; the errors would (sometimes) show up at
saving instead of (maybe) at loading, but the solution would still be
to handcode a default comparison for every single class individually.

I don't think anyone wants a smorgasbord of inconsistent, error-prone
boilerplate code.  (X<Y but also Y<X, because the two classes used
different default comparisons, so it matters which one gets asked.)
Even careful coders will sometimes get burned by the choice between
sorting on id(obj) and sorting on (obj.__class__, id(obj)).
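
To make the X<Y-and-Y<X case concrete, here is a minimal sketch.  The
class names and the `ident` attribute are invented for illustration;
`ident` stands in for id(obj) so that the run is repeatable.

```python
class ById:
    """Hand-written fallback: order by identity alone."""

    def __init__(self, ident):
        self.ident = ident  # stand-in for id(obj)

    def __lt__(self, other):
        return self.ident < other.ident


class ByClassThenId:
    """Hand-written fallback: order by class name, then identity."""

    def __init__(self, ident):
        self.ident = ident  # stand-in for id(obj)

    def __lt__(self, other):
        mine = (type(self).__name__, self.ident)
        theirs = (type(other).__name__, other.ident)
        return mine < theirs


x = ById(1)
y = ByClassThenId(2)

# The left operand's class answers, so each question uses a different rule.
print(x < y)  # True: 1 < 2
print(y < x)  # True: "ByClassThenId" sorts before "ById"
```

Each class is defensible on its own; the contradiction only appears
once instances of both end up in the same list.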

Guido has already rejected a standard mixin class.

Today's solution is to write the default comparison once and let
everything "inherit" it.

The persistence layers need an improved default comparison -- but they
can provide their own via wrapper objects, since they control the save
mechanism.  To be honest, this may be the only reasonable solution.  If
objects are saved with their original id and use that for comparison,
then you can continue a session, but you still have to worry about the
id being reused in a later session.  If ids are timestamped, you need
to worry about importing objects from multiple original processes, or
different machines, or ...  And the comparison gets slow, for something
that won't provide any benefit to smaller applications.
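
A rough sketch of the wrapper idea, under stated assumptions: the
names `Persisted`, `Account` and `oid` are hypothetical, and the
"store" is just a counter.  It shows only the single-store case; it
does nothing about the multi-process and multi-machine problems
mentioned above.

```python
import functools
import itertools


@functools.total_ordering
class Persisted:
    """Hypothetical wrapper: orders objects by a key the store assigned."""

    def __init__(self, obj, oid):
        self.obj = obj
        self.oid = oid  # written out with the object, unlike id(obj)

    def __eq__(self, other):
        if not isinstance(other, Persisted):
            return NotImplemented
        return self.oid == other.oid

    def __lt__(self, other):
        if not isinstance(other, Persisted):
            return NotImplemented
        return self.oid < other.oid

    def __hash__(self):
        return hash(self.oid)


class Account:
    """Stand-in for an application class with no ordering of its own."""

    def __init__(self, name):
        self.name = name


# "Session one": the store hands out keys as objects are first saved.
next_oid = itertools.count(1)
names = ("carol", "alice", "bob")
session_one = [Persisted(Account(n), next(next_oid)) for n in names]
saved = [(w.oid, w.obj.name) for w in session_one]

# "Session two": fresh objects with fresh id() values, same saved keys.
session_two = [Persisted(Account(n), oid) for oid, n in reversed(saved)]

order_one = [w.obj.name for w in sorted(session_one)]
order_two = [w.obj.name for w in sorted(session_two)]
print(order_one == order_two)  # True
```

The application classes stay untouched; only the layer that controls
saving needs to know about the key.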

Go ahead and tweak the default comparisons; there is room to argue about
whether (1,2) < [3,4] < (5,6) should evaluate as true.  Go ahead and
include more information, if it doesn't slow down the common case; it
might help the simpler databases.  But please don't make that
expression throw an exception.  (Or, at the very least, promote a
*standard* way to say "just get me a canonical ordering of some sort"
-- which we can write today with "sorted()".)
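
For comparison, here is a sketch of what "just get me a canonical
ordering" could look like if the PEP 3000 rule were in force and
mixed-type `<` raised TypeError.  `CanonicalKey` is a hypothetical
helper, not an existing or proposed API, and its rule (type name
first, then value, then id()) is only one of many possible choices.

```python
class CanonicalKey:
    """Hypothetical sort key: type name first, then value, then id()."""

    def __init__(self, obj):
        self.obj = obj

    def __lt__(self, other):
        a, b = self.obj, other.obj
        if type(a) is not type(b):
            return type(a).__name__ < type(b).__name__
        try:
            return bool(a < b)
        except TypeError:
            # Same type but no usable ordering: fall back to identity.
            return id(a) < id(b)


# Under the PEP 3000 rule the chained expression from the text fails.
try:
    (1, 2) < [3, 4] < (5, 6)
    mixed_compare_raises = False
except TypeError:
    mixed_compare_raises = True

items = [(5, 6), [3, 4], (1, 2), "b", 3, "a", 1]
ordered = sorted(items, key=CanonicalKey)
print(mixed_compare_raises)
print(ordered)  # [1, 3, [3, 4], 'a', 'b', (1, 2), (5, 6)]
```

The id() fallback has the same weakness Jim Fulton describes: it is
stable within one session only.  Two distinct types that share a name
are also left in their input order relative to each other.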

---------------------
References:

Guido agrees to removing the default comparisons:

    http://www.python.org/peps/pep-3000.html
        """
        Comparisons other than == and != between disparate types will raise an
        exception unless explicitly supported by the type [6]
        """
    and the more recent reference at
        http://mail.python.org/pipermail/python-dev/2005-November/057938.html

Jim Fulton says that the _current_ default breaks for persistence.
    http://mail.python.org/pipermail/python-dev/2005-November/057924.html

Guido rejects a comparison mixin
    http://mail.python.org/pipermail/python-dev/2005-November/057925.html