[issue13703] Hash collision security issue
report at bugs.python.org
Wed Jan 11 17:03:19 CET 2012
Marc-Andre Lemburg <mal at egenix.com> added the comment:
Antoine Pitrou wrote:
> Antoine Pitrou <pitrou at free.fr> added the comment:
>> OTOH, the collision counting patch is very simple, doesn't have
>> the performance issues and provides real protection against the
> I don't know about real protection: you can still slow down dict
> construction by 1000x (the number of allowed collisions per lookup),
> which can be enough combined with a brute-force DOS.
On my slow dev machine 1000 collisions run in around 22ms:
python2.7 -m timeit -n 100 "dict((x*(2**64 - 1), 1) for x in xrange(1, 1000))"
100 loops, best of 3: 22.4 msec per loop
Using this for a DOS attack would be rather noisy, much unlike
sending a single POST.
Note that the choice of 1000 as limit is rather arbitrary. I just
chose it because it's high enough because it's very unlikely to be
hit by an application that is not written to trigger it and it's low
enough to still provide a good run-time behavior. Perhaps an
even lower figure would be better.
> Also, how about false positives? Having legitimate programs break
> because of legitimate data would be a disaster.
Yes, which is why the patch should be disabled by default (using
an env var) in dot-releases. It's probably also a good idea to
make the limit configurable to adjust to ones needs.
Still, it is *very* unlikely that you run into real data causing
more than 1000 collisions for a single insert.
For full protection the universal hash method idea would have
to be implemented (adding a parameter to the hash methods, so
that they can be parametrized). This would then allow switching
the dict to an alternative hash implementation resolving the collision
problem, in case the implementation detects high number of
Python tracker <report at bugs.python.org>
More information about the Python-bugs-list