[Python-Dev] Status of the fix for the hash collision vulnerability

"Martin v. Löwis" martin at v.loewis.de
Sat Jan 14 16:12:19 CET 2012

Am 13.01.2012 18:08, schrieb Mark Dickinson:
> On Fri, Jan 13, 2012 at 2:57 AM, Guido van Rossum <guido at python.org> wrote:
>> How
>> pathological the data needs to be before the collision counter triggers? I'd
>> expect *very* pathological.
> How pathological do you consider the set
>    {1 << n for n in range(2000)}
> to be?  

I think this is not a counter-example for the proposed algorithm (at
least not in the way I think it should be implemented).

Those values may collide on the slot in the set, but they don't collide
on the actual hash value.

So in order to determine whether the collision limit is exceeded, we
shouldn't count colliding slots, but colliding hash values (which we
will all encounter during an insert).

> though admittedly only around 30 collisions per hash value.

I do consider the case of hashing integers with only one bit set
pathological. However, this can be overcome by factoring the magnitude
of the number into the hash as well.


More information about the Python-Dev mailing list