[Python-Dev] Why should the default hash(x) == id(x)?

Michael Chermside mcherm at mcherm.com
Wed Nov 2 18:39:44 CET 2005


Noam Raphael writes:
> Is there a reason why the default __hash__ method returns the id of the
> objects?
>
> It is consistent with the default __eq__ behaviour, which is the same
> as "is", but:
>
> 1. It can easily become inconsistent, if someone implements __eq__ and
> doesn't implement __hash__.
> 2. It is confusing: even if someone doesn't implement __eq__, he may
> see that it is suitable as a key to a dict, and expect it to be found
> by other objects with the same "value".
> 3. If someone does want to associate values with objects, he can
> explicitly use id:
> dct[id(x)] = 3. This seems to better explain what he wants.

Your first criticism is valid... it's too bad that there isn't a magical
__hash__ function that automatically derived its behavior from __eq__.
To your second point, I would tell this user to read the requirements.
And your third point isn't a criticism, just an alternative.
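
To make the first criticism concrete, here is a minimal sketch. The class
names LoosePoint and ConsistentPoint are hypothetical, invented for this
illustration. Note one assumption: current Python 3 sets __hash__ to None
when a class defines only __eq__, so the sketch restores the identity hash
by hand to reproduce the behaviour under discussion.

```python
class LoosePoint:
    """Hypothetical class: value-based __eq__, identity-based __hash__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, LoosePoint):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    # Assumption: Python 3 would set __hash__ to None here, so the default
    # identity-based hash is restored explicitly to show the inconsistency.
    __hash__ = object.__hash__


class ConsistentPoint(LoosePoint):
    """Same equality, but the hash is derived from the compared fields."""

    def __hash__(self):
        return hash((self.x, self.y))


a, b = LoosePoint(1, 2), LoosePoint(1, 2)
loose_table = {a: "found"}
loose_equal = (a == b)            # equal by value...
loose_hit = (b in loose_table)    # ...but looked up by identity hash

c, d = ConsistentPoint(1, 2), ConsistentPoint(1, 2)
consistent_table = {c: "found"}
consistent_hit = (d in consistent_table)
```

With the inconsistent class, an equal object is not found as a dict key;
deriving __hash__ from the same fields that __eq__ compares fixes that.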

But to answer your question, the reason that the default __hash__ returns
the ID in CPython is just that this works. In Jython, I believe that the
VM provides a native hash method, and __hash__ uses that instead of
returning the ID. Actually, it not only works, it's also FAST (which is
important... many algorithms rely on __hash__ being O(1)).

I can't imagine what you would propose instead. Keep in mind that the
requirements are that __hash__ must return the same value for objects
that compare equal, and should, as far as possible, return different
values for objects that do not. So, for instance, two mutable objects
with identical values and no custom __eq__ should return different
__hash__ values, as they are distinct objects.

> This leads me to another question: why should the default __eq__
> method be the same as "is"?

Another excellent question. The answer is that this is the desired
behavior of the language. Two user-defined object references are
considered equal if and only if (1) they are two references to the
same object, or (2) the user who designed it has specified a way
to compare objects (implemented __eq__) and it returns a True value.
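
As a rough illustration of that rule, the sketch below uses two hypothetical
classes (Account and Money are made up for this example, not taken from any
library). Account relies entirely on the defaults; Money has a designer-
supplied __eq__ and a matching __hash__.

```python
class Account:
    """Hypothetical class that defines neither __eq__ nor __hash__."""

    def __init__(self, owner):
        self.owner = owner


class Money:
    """Hypothetical class whose designer specified how to compare objects."""

    def __init__(self, cents):
        self.cents = cents

    def __eq__(self, other):
        if not isinstance(other, Money):
            return NotImplemented
        return self.cents == other.cents

    def __hash__(self):
        # Kept consistent with __eq__: equal objects hash equally.
        return hash(self.cents)


first = Account("noam")
second = Account("noam")   # identical contents, distinct object
alias = first              # second reference to the same object

same_object_equal = (first == alias)         # case (1): same object
distinct_equal = (first == second)           # neither case applies
custom_equal = (Money(500) == Money(500))    # case (2): user-defined __eq__

# With the defaults, each distinct object is its own dict key.
balances = {first: 10, second: 20}
```

Here first and second end up as separate keys in balances, while alias
reaches the same entry as first.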

> Why not make the default __eq__ really compare the objects, that is,
> their dicts and their slot-members?

Short answer: not the desired behavior. Longer answer: there are
three common patterns in object design. There are "value" objects,
which should be considered equal if all fields are equal. There are
"identity" objects which are considered equal only when they are
the same object. And then there are (somewhat less common) "value"
objects in which a few fields don't count -- they may be used for
caching a pre-computed result for example. The default __eq__
behavior has to cater to one of these -- clearly either "value"
objects or "identity" objects. Guido chose to cater to "identity"
objects believing that they are actually more common in most
situations. A beneficial side-effect is that the default behavior
of __eq__ is QUITE simple to explain, and if the implementation is
easy to explain then it may be a good idea.
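
The three patterns can be sketched as follows. This assumes a modern Python
with the dataclasses module (which did not exist when this was written), and
the class names Version, Connection and Polygon are hypothetical examples.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Version:
    """ "Value" object: equal when all fields are equal."""
    major: int
    minor: int


class Connection:
    """ "Identity" object: keeps the default __eq__, so only the same
    object compares equal to itself."""

    def __init__(self, host):
        self.host = host


@dataclass
class Polygon:
    """ "Value" object in which one field does not count."""
    vertices: tuple
    # Cache for a pre-computed result; compare=False leaves it out of __eq__.
    cached_area: Optional[float] = field(default=None, compare=False)


versions_equal = (Version(3, 4) == Version(3, 4))
connections_equal = (Connection("db1") == Connection("db1"))

unit_square = ((0, 0), (1, 0), (1, 1), (0, 1))
fresh = Polygon(unit_square)
cached = Polygon(unit_square, cached_area=1.0)
polygons_equal = (fresh == cached)
```

Only Connection gets its behaviour for free from the default; the other two
have to ask for value comparison explicitly, here via the decorator.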

-- Michael Chermside


