[Python-Dev] Keep default comparisons - or add a second set?

Noam Raphael noamraph at gmail.com
Wed Dec 28 14:29:03 CET 2005


And another example:

>>> a = 1+2j
>>> b = 2+1j
>>> a > b
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: no ordering relation is defined for complex numbers

I have come to think that, setting backwards compatibility aside for a
while, the best thing for comparison operators to do is to raise a
TypeError by default, and to work only for types that it makes sense
to compare. It is more in the spirit of "explicit is better than
implicit", and I now have two reasons for it:
1. Things like "Decimal(3.0) == 3.0" will make more sense (raising an
exception which explains that Decimals should not be compared to
floats, instead of constantly returning False).
2. It is more forward compatible - when it is discovered that two
types can sensibly be compared, the comparison can be defined without
changing any existing behaviour that doesn't raise an exception.
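To make point 1 concrete, here is a minimal sketch of what such a
strict comparison could look like today - StrictDecimal is a
hypothetical name of mine, not anything in the decimal module, and
this is only an illustration of the proposed behaviour, not a real
proposal for the class:

```python
from decimal import Decimal

class StrictDecimal(Decimal):
    """Hypothetical sketch: comparing with a float raises TypeError
    instead of silently answering False (or True)."""

    def __eq__(self, other):
        if isinstance(other, float):
            raise TypeError("Decimal should not be compared to float")
        return Decimal.__eq__(self, other)

    # overriding __eq__ suppresses inherited hashing, so restore it
    __hash__ = Decimal.__hash__
```

With this, StrictDecimal("3") == Decimal("3") still works, while
StrictDecimal("3") == 3.0 raises a TypeError that points at the real
problem.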

Perhaps you can explain to me again why arbitrary objects should be
comparable? I don't see why sorting objects by value should work when
the order has no real meaning, and I don't see why all objects need
to be comparable in order to store them in containers with the
behaviour of dict or set.

If the reason is that you want containers that work across multiple
sessions and are "identity-based" (that is, keyed by object
identity), you can attach to each object an id that isn't
session-dependent, and use that instead of the default memory address.

This may be a reason for dropping the default "hash is id": suppose
that you want a persistent storage that works like a dict but is not
tied to one Python session (it may be exactly Zope's BTrees - I don't
know). Currently you have a problem with using __hash__ for that
purpose, since the hash value of an object can change between
sessions - that happens when it is the id of the object. Now suppose
that we have a "persistent" id function - it can be implemented by
using weakrefs to associate a unique value with an object the first
time the function is called, and by storing that value with the
object when serializing it. Also suppose that we drop the default
hash method, so that where currently hash(x) is id(x), hash(x) will
raise a TypeError. Then our persistent storage can use the persistent
id instead of the default id, and it will work. (You might not want
mutable objects to be used as keys, but that's a separate problem -
the data structure will be consistent either way.)
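The persistent id function described above could be sketched roughly
like this - a minimal in-session sketch, assuming the serialization
hook that stores the id alongside the object exists elsewhere (not
shown), and with all names (persistent_id and the module-level tables)
being hypothetical:

```python
import itertools
import weakref

_next_pid = itertools.count()
_pids = {}     # id(obj) -> persistent id
_cleanup = {}  # id(obj) -> weakref that clears the entry when obj dies

def persistent_id(obj):
    """Return a unique id for obj, assigned on first call.

    Unlike id(), the value is not a memory address, so it could be
    serialized with the object and reused in a later session.
    """
    key = id(obj)
    if key not in _pids:
        _pids[key] = next(_next_pid)

        # When obj is collected, free its slot so that a recycled
        # id() value cannot be mistaken for the dead object's entry.
        def _drop(ref, key=key):
            _pids.pop(key, None)
            _cleanup.pop(key, None)

        _cleanup[key] = weakref.ref(obj, _drop)
    return _pids[key]
```

A persistent container would then key its entries on persistent_id(x)
rather than on hash(x), which keeps keys stable across sessions.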

In fewer words, the built-in id() is just one way to assign
identities to objects. __hash__ shouldn't fall back on it implicitly
when there is no value-based hash - if it didn't, the rule that
x == y implies hash(x) == hash(y) would hold across sessions as well,
so persistent objects would be able to use hash values.

Does it make sense to you?

Noam
