[issue13703] Hash collision security issue

Marc-Andre Lemburg report at bugs.python.org
Thu Jan 19 16:11:55 CET 2012


Marc-Andre Lemburg <mal at egenix.com> added the comment:

STINNER Victor wrote:
> 
> I tried the collision counting with a low number of collisions:
> ... no false positives with a limit of 50 collisions ...

Thanks for running those tests. Looks like a limit lower than 1000
would already do just fine.

Some timings showing how long it would take to hit a limit:

# 100
python2.7 -m timeit -n 100 "dict((x*(2**64 - 1), 1) for x in xrange(1, 100))"
100 loops, best of 3: 297 usec per loop

# 250
python2.7 -m timeit -n 100 "dict((x*(2**64 - 1), 1) for x in xrange(1, 250))"
100 loops, best of 3: 1.46 msec per loop

# 500
python2.7 -m timeit -n 100 "dict((x*(2**64 - 1), 1) for x in xrange(1, 500))"
100 loops, best of 3: 5.73 msec per loop

# 750
python2.7 -m timeit -n 100 "dict((x*(2**64 - 1), 1) for x in xrange(1, 750))"
100 loops, best of 3: 12.7 msec per loop

# 1000
python2.7 -m timeit -n 100 "dict((x*(2**64 - 1), 1) for x in xrange(1, 1000))"
100 loops, best of 3: 22.4 msec per loop

These timings have to matched against the size of the payload
needed to trigger those limits.

In any case, the limit needs to be configurable like the hash seed
in the randomization patch.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue13703>
_______________________________________


More information about the Python-bugs-list mailing list