hashkey/digest for a complex object

kj no.email at please.post
Sat Oct 9 15:54:03 EDT 2010


In <87y6a9lqnj.fsf at gmail.com> Arnaud Delobelle <arnodel at gmail.com> writes:

>You could do something like this:

>deep_methods = {
>    list: lambda f, l: tuple(map(f, l)),
>    dict: lambda f, d: frozenset((k, f(v)) for k, v in d.items()),
>    set: lambda f, s: frozenset(map(f, s)),
>    # Add more if needed
>    }

>def apply_method(f, obj):
>    try:
>        method = deep_methods[type(obj)]
>    except KeyError:
>        return obj
>    return method(f, obj)

>def deepfreeze(obj):
>    """Return a 'hashable version' of an object
>    return apply_method(deepfreeze, obj)

>def deephash(obj):
>    """Return hash(deepfreeze(obj)) without deepfreezing"""
>    return hash(apply_method(deephash, obj))

># Example of deepfreezable object:
>obj = [1, "foo", {(2, 4): {7, 5, 4}, "bar": "baz"}]
                           ^       ^
                           |       |
                           `-------`------- what's this?


>>>> deepfreeze(obj)
>(1, 'foo', frozenset({('bar', 'baz'), ((2, 4), frozenset({4, 5, 7}))}))
>>>> deephash(obj)
>1341422540
>>>> hash(deepfreeze(obj))
>1341422540


After fixing the missing closing """ in deepfreeze's docstring, this
code works as advertised, but I'm mystified by the equality of
hash(deepfreeze(...)) and deephash(...).  Without some knowledge of
Python internals, I don't see why it should hold.

More specifically, it is not obvious to me that, for example,

hash(frozenset((<whatever>,)))

would be identical to

hash(frozenset((hash(<whatever>),)))

but this identity has held every time I've checked it.  Similarly
for other more complicated variations on this theme.
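
One way to probe this is sketched below. It assumes CPython, where (as
far as I can tell) the hash of a tuple or frozenset is computed only
from the hashes of its elements, and where hash(n) == n for small
non-negative ints. The Key class is a hypothetical stand-in for
<whatever>; its fixed hash value of 42 is chosen purely for illustration.

```python
class Key:
    """Hypothetical stand-in for <whatever>: hashable, with a fixed hash."""

    def __init__(self, h):
        self._h = h

    def __hash__(self):
        return self._h

    def __eq__(self, other):
        return isinstance(other, Key) and self._h == other._h


k = Key(42)
h = hash(k)  # the int returned by Key.__hash__

# A small non-negative int hashes to itself (CPython behaviour).
int_is_fixed_point = hash(h) == h

# If container hashes depend only on element hashes, then replacing an
# element by its own hash should leave the container's hash unchanged.
tuple_match = hash((k,)) == hash((h,))
frozenset_match = hash(frozenset((k,))) == hash(frozenset((h,)))

print(int_is_fixed_point, tuple_match, frozenset_match)
```

If this is the mechanism, the identity rests on hash(hash(x)) == hash(x).
That need not hold for every hash value on every build: where int hashes
are reduced modulo sys.hash_info.modulus, a sufficiently large hash value
h would have hash(h) != h. So it seems safer to treat the identity as an
implementation detail than as a guarantee.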

Anyway, thanks for the code.  It's very useful.

~kj


