[Python-Dev] Hashing proposal: change only string-only dicts
martin at v.loewis.de
Wed Jan 18 01:30:59 CET 2012
Quoting Victor Stinner <victor.stinner at haypocalc.com>:
>> Each string would get two hashes: the "public" hash, which is constant
>> across runs and bugfix releases, and the dict-hash, which is only used
>> by the dictionary implementation, and only if all keys to the dict are
>> strings.
>
> The distinction between secret (private, secure) and "public" hash
> (deterministic) is not clear to me.
It's not about privacy or security. It's about compatibility. The
dict-hash is only used in the dict implementation, and never exposed,
leaving the tp_hash unmodified.
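As a rough illustration only, here is a toy model in Python; it is not the proposed C patch. The names ToyDict, dict_hash and _DICT_KEY are hypothetical, and keyed BLAKE2 merely stands in for whatever function the dict-hash would actually be. The point is that hash() keeps returning the public hash, while the container uses a different hash internally for as long as every key is exactly a str.

```python
import hashlib
import os

# Hypothetical per-process key; stands in for whatever the real dict
# implementation would use. This is not a CPython API.
_DICT_KEY = os.urandom(16)


def dict_hash(s):
    """Keyed hash used only inside ToyDict; never returned by hash()."""
    data = s.encode("utf-8", "surrogatepass")
    digest = hashlib.blake2b(data, key=_DICT_KEY, digest_size=8).digest()
    return int.from_bytes(digest, "little")


class ToyDict:
    def __init__(self, nbuckets=8):
        self._buckets = [[] for _ in range(nbuckets)]
        self._all_str = True  # True while every key is exactly a str

    def _bucket(self, key):
        # String-only mode uses the dict-hash; otherwise the public hash.
        h = dict_hash(key) if self._all_str else hash(key)
        return self._buckets[h % len(self._buckets)]

    def _switch_to_public_hash(self):
        # A non-str key arrived: re-bucket every entry using hash().
        items = [kv for bucket in self._buckets for kv in bucket]
        self._all_str = False
        self._buckets = [[] for _ in self._buckets]
        for k, v in items:
            self._bucket(k).append((k, v))

    def __setitem__(self, key, value):
        if self._all_str and type(key) is not str:
            self._switch_to_public_hash()
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (k, value)
                return
        bucket.append((key, value))

    def __getitem__(self, key):
        if self._all_str and type(key) is not str:
            # Toy shortcut: a non-str lookup in string-only mode scans
            # every entry instead of hashing the key.
            candidates = [kv for b in self._buckets for kv in b]
        else:
            candidates = self._bucket(key)
        for k, v in candidates:
            if k == key:
                return v
        raise KeyError(key)


d = ToyDict()
d["spam"] = 1
d["eggs"] = 2      # still string-only: buckets chosen by dict_hash()
d[42] = "answer"   # first non-str key: falls back to hash()
```

Nothing in the sketch changes what hash("spam") returns, which is the compatibility argument made above.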
> Example: collections.UserDict implements __hash__() using
> hash(self.data).
Are you sure? I only see that used for UserString, not UserDict.
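A quick way to check this, assuming a Python 3 interpreter where both classes live in the collections module (the variable names are only for illustration):

```python
from collections import UserDict, UserString

# UserString forwards hashing to the wrapped str.
userstring_hash_matches = hash(UserString("abc")) == hash("abc")

# UserDict is a mutable mapping and does not define __hash__.
try:
    hash(UserDict())
    userdict_hashable = True
except TypeError:
    userdict_hashable = False

print(userstring_hash_matches, userdict_hashable)
```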
> collections.abc.Set computes its hash using hash(x) of each item. Same
> question.
The hash of the Set should most certainly use the elements' tp_hash.
That *is* the hash of the objects, and it may collide for strings
just fine due to the vulnerability.
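A small sketch of that behaviour, using the _hash() helper that collections.abc.Set provides for hashable subclasses. Tracked and ListSet are hypothetical classes written only for this example; Tracked counts how often its __hash__ is called.

```python
from collections.abc import Set


class Tracked:
    """Element type that counts calls to its own __hash__."""
    calls = 0

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        Tracked.calls += 1
        return hash(self.value)  # the ordinary public hash

    def __eq__(self, other):
        return isinstance(other, Tracked) and self.value == other.value


class ListSet(Set):
    """Minimal hashable Set backed by a list; items are assumed unique."""

    def __init__(self, items):
        self._items = list(items)

    def __contains__(self, item):
        return item in self._items

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)

    def __hash__(self):
        # Set._hash() combines hash(x) for each element x.
        return self._hash()


Tracked.calls = 0
set_hash = hash(ListSet([Tracked("a"), Tracked("b")]))
element_hash_calls = Tracked.calls

# The combined value does not depend on iteration order.
same_regardless_of_order = set_hash == hash(ListSet([Tracked("b"), Tracked("a")]))
```

The set's hash is built from the elements' own hashes, so it is unaffected by any separate hash a dict might use internally.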
> If we need to use the secret hash, it should be exposed in Python.
It's not secret, just specific. I don't mind it being exposed. However,
that would be a new feature, which cannot be added in a security fix
or bug fix release.
> Which function/method would be used? I suppose that we cannot add
> anything to stable releases like 2.7.
Right. Nor do I see any need to expose it. It fixes the vulnerability
just fine without being exposed.
Regards,
Martin