[Python-Dev] Hash collision security issue (now public)

Jim Jewett jimjjewett at gmail.com
Mon Jan 2 01:02:44 CET 2012

Paul McMillan in

> Guido van Rossum wrote:
>> Hm. I'm not sure I like the idea of extra arithmetic for every character
>> being hashed.

> the collision generator doesn't necessarily vary the length of the
> string. Additionally, if we don't vary based on all the letters in the
> string, an attacker can fix the characters that we do use and generate
> colliding strings around them.

If the new hash algorithm doesn't kick in before, say, 32 characters,
then most currently hashed strings will not be affected.  And if the
attacker has to add 32 characters to every key, it reduces the "this
can be done with only N bytes uploaded" risk.  (The same logic
would apply to even longer prefixes, except that an attacker might
more easily find short-enough strings that collide.)

> We could also consider a less computationally expensive operation
> than the modulo for calculating the lookup index, like simply truncating
> to the correct number of bits.

Given that the modulo is always 2^N, how is that different?


More information about the Python-Dev mailing list