[Python-Dev] Documentation Error for __hash__

Matt Giuca matt.giuca at gmail.com
Fri Aug 29 14:47:48 CEST 2008


>
>> It's probably a good idea to implement __hash__ for objects that
>> implement comparisons, but it won't always work and it is certainly
>> not needed, unless you intend to use them as dictionary keys.
>>
>>
>>
>
>
> So you're suggesting that we document something like.
>
> Classes that represent mutable values and define equality methods are free
> to define __hash__ so long as you don't mind them being used incorrectly if
> treated as dictionary keys...
>
> Technically true, but not very helpful in my opinion... :-)


No, I think he was suggesting we document that if a class overrides __eq__,
it's a good idea to also implement __hash__, so it can be used as a
dictionary key.

However I have issues with this. First, he said:

"It's probably a good idea to implement __hash__ for objects that
implement comparisons, but it won't always work and it is certainly
not needed, unless you intend to use them as dictionary keys."

You can't say "certainly not needed unless you intend to use them as
dictionary keys", since if you are defining a class, you never know when
someone else will want to use its instances as dict keys (or in a set, mind!).
So *if possible*, it is a good idea to implement __hash__ if you are
implementing __eq__.

But also, it needs to be very clear that you *should not* implement
__hash__ on a mutable object -- and it already is. So basically the docs
should suggest that it is a good idea to implement __hash__ if you are
implementing __eq__ on an immutable object.
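As a minimal sketch of that recommendation (the Point class and its attribute names are hypothetical, and immutability here is only by convention via read-only properties), an immutable value type can define __eq__ and __hash__ over the same fields so that equal objects hash equally:

```python
class Point:
    """Hypothetical immutable 2-D point, used only for illustration."""

    def __init__(self, x, y):
        # Stored privately and exposed read-only, so the hash never changes.
        self._x = x
        self._y = y

    @property
    def x(self):
        return self._x

    @property
    def y(self):
        return self._y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self._x, self._y) == (other._x, other._y)

    def __hash__(self):
        # Hash exactly the fields that __eq__ compares.
        return hash((self._x, self._y))


lookup = {Point(1, 2): "a"}
found = lookup[Point(1, 2)]  # an equal but distinct instance finds the entry
unique = {Point(1, 2), Point(1, 2), Point(3, 4)}
```

Because __hash__ is derived from the same tuple that __eq__ compares, a distinct but equal Point works as a dict key and duplicates collapse in a set.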

HOWEVER,

There are two contradictory pieces of information in the docs.

a) "if it defines __cmp__() or __eq__() but not __hash__(), its instances
will not be usable as dictionary keys."
<http://docs.python.org/dev/reference/datamodel.html#object.__hash__>
versus
b) "User-defined classes have __cmp__() and __hash__() methods by default;
with them, all objects compare unequal and x.__hash__() returns id(x)."
<http://docs.python.org/dev/3.0/reference/datamodel.html#object.__hash__>

Note that these statements are somewhat contradictory: if a class has a
__hash__ method by default (as b suggests), then it isn't possible to "not
have a __hash__" (as suggested by a).

In Python 2, statement (a) is true for old-style classes only, while
statement (b) is true for new-style classes only. This distinction needs to
be made. (Old-style classes do not have a __hash__ method by default;
rather, the hash() function knows how to deal with objects without a
__hash__ method, by calling id()).

In Python 3, statement (a) is always true, while statement (b) is not (in
fact, classes behave just as old-style classes do in Python 2). So the
Python 3 docs can get away with being simpler (without having to handle that
weird case).
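Statement (a) can be checked directly in Python 3. The following is a small sketch, assuming a hypothetical class OnlyEq that defines __eq__ and nothing else; Python 3 then sets the class's __hash__ to None, so its instances are rejected as dictionary keys:

```python
class OnlyEq:
    """Hypothetical class that defines __eq__ but not __hash__."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, OnlyEq):
            return NotImplemented
        return self.value == other.value


try:
    {OnlyEq(1): "x"}  # building the dict needs hash(OnlyEq(1))
    usable_as_key = True
except TypeError:
    # Python 3 raises TypeError for an unhashable key.
    usable_as_key = False
```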

I just saw Marc-Andre's new email come in; I'll look at that now.

Matt