[Python-Dev] Hash collision security issue (now public)

Christian Heimes lists at cheimes.de
Thu Dec 29 14:19:28 CET 2011


On 29.12.2011 12:13, Mark Shannon wrote:
> The attack relies on being able to predict the hash value for a given
> string. Randomising the string hash function is quite straightforward.
> There is no need to change the dictionary code.
> 
> A possible (*untested*) patch is attached. I'll leave it for those more 
> familiar with unicodeobject.c to do properly.

I'm worried that hash randomization of str is going to break 3rd party
software that relies on a stable hash across multiple Python instances.
Persistence layers like ZODB and cross-interpreter communication
channels used by multiprocessing may (!) rely on the fact that the hash
of a string is fixed.

Perhaps the dict code is a better place for randomization. The code in
lookdict() and lookdict_unicode() could add a value to the hash. My
approach is less intrusive and also closes the attack vector for all
possible objects including str, bytes, int and so on. I also like Armin's
idea of an optional hash randomization.
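
To make the idea concrete, here is a rough, untested-in-CPython sketch in
Python of what randomizing at the lookup level might look like. It is a toy
open-addressing table, not the real dict code: the class name SaltedTable,
the salt attribute and the linear probing are all assumptions made for
illustration. The only point is that hash(key) stays unchanged and just the
slot selection is perturbed by a per-table value.

```python
import os


class SaltedTable:
    """Toy open-addressing table; NOT CPython's dict implementation."""

    def __init__(self, size=8, salt=None):
        # size is assumed to be a power of two so that masking works.
        self.mask = size - 1
        self.slots = [None] * size
        # Hypothetical per-table secret; CPython's dict has no such field.
        if salt is None:
            salt = int.from_bytes(os.urandom(4), "little")
        self.salt = salt

    def _start(self, key):
        # hash(key) itself is untouched; only the starting slot is shifted.
        # Keys whose hash() values are equal still share a starting slot
        # within one table.
        return (hash(key) + self.salt) & self.mask

    def _find(self, key):
        i = self._start(key)
        for _ in range(len(self.slots)):
            entry = self.slots[i]
            if entry is None or entry[0] == key:
                return i
            i = (i + 1) & self.mask  # linear probing, chosen for brevity
        raise RuntimeError("table full (this sketch never resizes)")

    def __setitem__(self, key, value):
        self.slots[self._find(key)] = (key, value)

    def __getitem__(self, key):
        entry = self.slots[self._find(key)]
        if entry is None:
            raise KeyError(key)
        return entry[1]
```

Two tables with different salts put the same key in different slots, while
hash() of the key is the same value everywhere, so anything that persists
or transmits hash values would not notice the change.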

Christian

