[Python-Dev] Hashing proposal: 64-bit hash

martin at v.loewis.de martin at v.loewis.de
Sat Jan 28 21:57:01 CET 2012

Zitat von Serhiy Storchaka <storchaka at gmail.com>:

> 27.01.12 23:08, Frank Sievertsen написав(ла):
>>> As already mentioned, the vulnerability of 64-bit Python rather
>>> theoretical and not practical. The size of the hash makes the attack
>>> is extremely unlikely.
>> Unfortunately this assumption is not correct. It works very good with
>> 64bit-hashing.
>> It's much harder to create (efficiently) 64-bit hash-collisions.
>> But I managed to do so and created strings with
>> a length of 16 (6-bit)-characters (a-z, A-Z, 0-9, _, .). Even
>> 14 characters would have been enough.
>> You need less than twice as many characters for the same effect as in
>> the 32bit-world.
> The point is not the length of the string, but the size of string  
> space for inspection. To search for a string with a specified 64-bit  
> hash to iterate over 2 ** 64 strings.

I think you entirely missed the point of Frank's message. Despite your
analysis that it shall not be possible, Frank has *actually* computed
colliding strings, most likely also for a specified hash value.

> Of course, to calculate the hash function to use secure, not  
> allowing "cut corners" and reduce computation time.

This issue wouldn't be that relevant if there wasn't a documented
algorithm to significantly reduce the number of tries you need to
make to produce a string with a desired hash value. My own implementation
would need 2**33 tries in the worst case (for a 64-bit hash value);
thanks to the birthday paradox, it's actually a significant chance
that the algorithm finds collisions even faster.


More information about the Python-Dev mailing list