[Python-Dev] Status of the fix for the hash collision vulnerability

Fri Jan 13 12:49:15 CET 2012

>> Unfortunately it requires only a few seconds to compute enough 32bit
>> collisions on one core with no precomputed data.
> Are you running the hash function "backward" to generate strings with
> the same value, or you are more trying something like brute forcing?

If you try it brute force to hit a specific target, you'll only find 
only one good string every 4 billion tries. That's why you first blow up 
your target:

You start backward from an arbitrary target-value. You brute force for 3 
characters, for example, this will give you 16 million intermediate 
values from which you know that they'll end up in your target-value.

Those 16 million values are a huge target for now brute-forcing forward: 
Every 256 tries you'll hit one of these values.

> And how do you get the hash secret? You need it to run an attack.

I don't know. This was meant as an answer to the quoted text "So I hope 
that computing collisions requires a lot of CPU time (is slow) to make 
the attack ineffective with today computers.".

What I wanted to say is: The security relies on the fact that the 
attacker can't guess the prefix, not that he can't precompute the values 
and it takes hours or days to compute the collisions. If the prefix 
leaks out of the application, then the rest is trivial and done in a few 
seconds. The suffix is not important for the collision-prevention, but 
it will probably make it much harder to guess the prefix.

I don't know an effective way to get the prefix either, (if the 
application doesn't leak full hash(X) values).

Frank