hashkey/digest for a complex object
Arnaud Delobelle
arnodel at gmail.com
Thu Oct 7 16:00:16 EDT 2010
kj <no.email at please.post> writes:
> The short version of this question is: where can I find the algorithm
> used by the tuple class's __hash__ method?
>
> Now, for the long version of this question, I'm working with some
> complext Python objects that I want to be able to compare for
> equality easily.
>
> These objects are non-mutable once they are created, so I would
> like to use a two-step comparison for equality, based on the
> assumption that I can compute (either at creation time, or as needed
> and memoized) a hashkey/digest for each object. The test for
> equality of two of these objects would first compare their hashkeys.
> If they are different, the two objects are declared different; if
> they match, then a more stringent test for equality is performed.
>
> So the problem is to come up with a reasonable hashkey for each of
> these objects. The objects have two significant attributes, and
> two of these objects should be regarded as equal if these attributes
> are "the same" in both. The first attribute is a simple dictionary
> whose keys are integers and values are strings. The second attribute
> is more complicated. It is a tree structure, represented as a
> dictionary of dictionaries of dictionaries... until we get to the
> leaf elements, which are frozensets of strings. The keys at every
> level of this structure are strings. E.g. a simple example of such
> an attribute would look like:
>
> {'A': {'a': set(['1', '2', '3']),
> 'b': set(['4', '5'])},
> 'B': set(['6', '7', '8'])}
>
> I'm looking for a good algorithm for computing a hash key for
> something like this? (Basically I'm looking for a good way to
> combine hashkeys.)
You could do something like this:
deep_methods = {
list: lambda f, l: tuple(map(f, l)),
dict: lambda f, d: frozenset((k, f(v)) for k, v in d.items()),
set: lambda f, s: frozenset(map(f, s)),
# Add more if needed
}
def apply_method(f, obj):
try:
method = deep_methods[type(obj)]
except KeyError:
return obj
return method(f, obj)
def deepfreeze(obj):
"""Return a 'hashable version' of an object
return apply_method(deepfreeze, obj)
def deephash(obj):
"""Return hash(deepfreeze(obj)) without deepfreezing"""
return hash(apply_method(deephash, obj))
# Example of deepfreezable object:
obj = [1, "foo", {(2, 4): {7, 5, 4}, "bar": "baz"}]
>>> deepfreeze(obj)
(1, 'foo', frozenset({('bar', 'baz'), ((2, 4), frozenset({4, 5, 7}))}))
>>> deephash(obj)
1341422540
>>> hash(deepfreeze(obj))
1341422540
--
Arnaud
More information about the Python-list
mailing list