[issue13703] Hash collision security issue

Wed Jan 11 10:28:31 CET 2012

Marc-Andre Lemburg <mal at egenix.com> added the comment:

STINNER Victor wrote:
> 
> Patch version 5 fixes test_unicode for 64-bit system.

Victor, I don't think the randomization idea is going anywhere. The
code has many issues:

 * it is exceedingly complex
 * the method would need to be implemented for all hashable
   Python types
 * it causes startup time to increase (you need urandom data for
   every single hashable Python data type)
 * it causes run-time to increase due to changes in the hash
   algorithm (more operations in the tight loop)
 * causes different processes in a multi-process setup to use different
   hashes for the same object
 * doesn't appear to work well in embedded interpreters that
   regularly restarted interpreters (AFAIK, some objects persist across
   restarts and those will have wrong hash values in the newly started
   instances)

The most important issue, though, is that it doesn't really
protect Python against the attack - it only makes it less
likely that an adversary will find the init vector (or a way
around having to find it via crypt analysis).

OTOH, the collision counting patch is very simple, doesn't have
the performance issues and provides real protection against the
attack. Even better still, it can detect programming errors in
hash method implementations.

IMO, it would be better to put efforts into refining the collision
detection patch (perhaps adding support for the universal hash
method slot I mentioned) and run some real life tests with it.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue13703>
_______________________________________