[Python-Dev] Hash collision security issue (now public)
v+python at g.nevcal.com
Fri Jan 6 04:39:30 CET 2012
On 1/5/2012 4:10 PM, Nick Coghlan wrote:
> On Fri, Jan 6, 2012 at 8:15 AM, Serhiy Storchaka<storchaka at gmail.com> wrote:
>> 05.01.12 21:14, Glenn Linderman wrote:
>>> So, fixing the vulnerable packages could be a sufficient response,
>>> rather than changing the hash function. How to fix? Each of those
>>> above allocates and returns a dict. Simply have each of those allocate
>>> and return a wrapped dict, which has the following behaviors:
>>> i) during __init__, create a local, random, string.
>>> ii) for all key values, prepend the string, before passing it to the
>>> internal dict.
>> Good idea.
Thanks for the implementation, Serhiy. That is the sort of thing I had
in mind, indeed.
> Not a good idea - a lot of the 3rd party tests that depend on dict
> ordering are going to be using those modules anyway,
Stats? Didn't someone post a list of tests that fail when changing the
hash? Oh, those were stdlib tests, not 3rd party tests. I'm not sure
how to gather the stats, then, are you?
> so scattering our
> solution across half the standard library is needlessly creating
> additional work without really reducing the incompatibility problem.
Half the standard library? No one has cared to augment my list of
modules, but I have seen reference to JSON in addition to cgi and
urllib.parse. I think there are more than 6 modules in the standard
library.
> If we're going to change anything, it may as well be the string
> hashing algorithm itself.
Changing the string hashing algorithm is known (or at least no one has
argued otherwise) to be a source of backward incompatibility that will
break programs. My proposal (and Serhiy's implementation, assuming it
works or can be easily tweaked to work; I haven't reviewed it in detail
or attempted to test it) will only break programs that have vulnerabilities.
I failed to mention one other benefit of my proposal: every web request
would have a different random prefix, so attempting to gather info is
futile: the next request has a different random prefix, so different
strings would collide.
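
For readers following along, here is a minimal sketch of the wrapped-dict
idea described above. It is not Serhiy's implementation; the class name
PrefixedDict and its helper methods are hypothetical, and the sketch assumes
Python 3 with keys that are mostly str. It only shows the mechanics of steps
i) and ii); whether prefixing actually defeats crafted collisions depends on
the string hash function and is not demonstrated here.

```python
import os
from collections.abc import MutableMapping


class PrefixedDict(MutableMapping):
    """Hypothetical wrapper: str keys are stored with a per-instance prefix."""

    def __init__(self, *args, **kwargs):
        # Step i): create a local, random string for this instance only.
        self._prefix = os.urandom(8).hex()
        self._data = {}
        self.update(*args, **kwargs)

    def _wrap(self, key):
        # Step ii): prepend the prefix before the key reaches the inner dict.
        # Assumption: non-str keys are passed through unchanged.
        if isinstance(key, str):
            return self._prefix + key
        return key

    def _unwrap(self, stored):
        # Every str key in the inner dict carries the prefix, so strip it.
        if isinstance(stored, str):
            return stored[len(self._prefix):]
        return stored

    def __getitem__(self, key):
        try:
            return self._data[self._wrap(key)]
        except KeyError:
            # Report the caller's key, not the internal prefixed one.
            raise KeyError(key) from None

    def __setitem__(self, key, value):
        self._data[self._wrap(key)] = value

    def __delitem__(self, key):
        try:
            del self._data[self._wrap(key)]
        except KeyError:
            raise KeyError(key) from None

    def __iter__(self):
        # Iteration order follows the inner dict, so it may differ per instance.
        return (self._unwrap(k) for k in self._data)

    def __len__(self):
        return len(self._data)
```

A parser such as the ones in cgi or urllib.parse would build one such object
per request, so each request gets its own prefix. Callers see the original
keys; only the inner dict sees the prefixed ones.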
Indeed it is nice when we can be cheery even when arguing, for the most
part :) I've enjoyed reading the discussions in this forum because most
folks have respect for other people's opinions, even when they differ.