hashkey/digest for a complex object

Robert Kern robert.kern at gmail.com
Wed Oct 6 15:31:59 EDT 2010


On 10/6/10 1:58 PM, kj wrote:
>
> The short version of this question is: where can I find the algorithm
> used by the tuple class's __hash__ method?

The function tuplehash() in Objects/tupleobject.c, predictably enough.

> Now, for the long version of this question, I'm working with some
> complext Python objects that I want to be able to compare for
> equality easily.
>
> These objects are non-mutable once they are created, so I would
> like to use a two-step comparison for equality, based on the
> assumption that I can compute (either at creation time, or as needed
> and memoized) a hashkey/digest for each object.  The test for
> equality of two of these objects would first compare their hashkeys.
> If they are different, the two objects are declared different; if
> they match, then a more stringent test for equality is performed.

The most straightforward way to implement __hash__ for a complicated object is 
to normalize its relevant data to a tuple of hashable objects and then call 
hash() on that tuple. A straightforward way to compare such objects is to 
calculate that very same normalized tuple for each one and compare those tuples. 
Cache it if necessary. Don't bother hashing to implement __eq__ unless if you 
are really optimizing for space and time.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco




More information about the Python-list mailing list