[Python-Dev] Status of the fix for the hash collision vulnerability

Guido van Rossum guido at python.org
Fri Jan 13 22:22:32 CET 2012


On Fri, Jan 13, 2012 at 10:13 AM, Mark Dickinson <dickinsm at gmail.com> wrote:

> On Fri, Jan 13, 2012 at 5:43 PM, Guido van Rossum <guido at python.org>
> wrote:
> >> How pathological do you consider the set
> >>
> >>   {1 << n for n in range(2000)}
> >>
> >> to be?  What about the set:
> >>
> >>   ieee754_powers_of_two = {2.0**n for n in range(-1074, 1024)}
> >>
> >> ?  The > 2000 elements of the latter set have only 61 distinct hash
> >> values on 64-bit machine, so there will be over 2000 total collisions
> >> involved in creating this set (though admittedly only around 30
> >> collisions per hash value).
> >
> > Hm... So how does the collision counting work for this case?
>
> Ah, my bad.  It looks like the ieee754_powers_of_two is safe---IIUC,
> it's the number of collisions involved in a single key-set operation
> that's limited.  So a dictionary with keys {1<<n for n in range(2000)}
> is fine, but a dictionary with keys  {1<<(61*n) for n in range(2000)}
> is not:
>
> >>> {1<<(n*61):True for n in range(2000)}
> Traceback (most recent call last):
>  File "<stdin>", line 1, in <module>
>  File "<stdin>", line 1, in <dictcomp>
> KeyError: 'too many hash collisions'
> [67961 refs]
>
> I'd still not consider this particularly pathological, though.


Really? Even though you came up with specifically to prove me wrong?

-- 
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-dev/attachments/20120113/5dea04fe/attachment.html>


More information about the Python-Dev mailing list