[Python-Dev] Hash collision security issue (now public)
Mark Shannon
mark at hotpy.org
Thu Dec 29 12:25:03 CET 2011
Raymond Hettinger wrote:
> FWIW, Uncle Timmy considers the non-randomized hashes to be a virtue.
> It is believed that they give us better-than-random results for commonly
> encountered datasets. A change to randomized hashes would have a
> negative performance impact on those cases.
Tim Peter's analysis applies mainly to ints which would be unchanged.
A change to the hash function for strings would make no difference to
the performance of the dict, as the ordering of the hash values is
already quite different from the ordering of the strings for any string
of more than 3 characters.
>
> Also, randomizing the hash wreaks havoc on doctests, book examples
> not matching actual dict reprs, and on efforts by users to optimize
> the insertion order into dicts with frequent lookups.
The docs clearly state that the ordering of iteration over dicts is
arbitrary. Perhaps changing it once in a while might be a good thing :)
Cheers,
Mark.
>
>
> Raymond
>
>
>
>
>
> On Dec 28, 2011, at 5:28 PM, Michael Foord wrote:
>
>> Hello all,
>>
>> A paper (well, presentation) has been published highlighting security problems with the hashing algorithm (exploiting collisions) in many programming languages Python included:
>>
>> http://events.ccc.de/congress/2011/Fahrplan/attachments/2007_28C3_Effective_DoS_on_web_application_platforms.pdf
>>
>> Although it's a security issue I'm posting it here because it is now public and seems important.
>>
>> The issue they report can cause (for example) handling an http post to consume horrible amounts of cpu. For Python the figures they quoted:
>>
>> reasonable-sized attack strings only for 32 bits Plone has max. POST size of 1 MB
>> 7 minutes of CPU usage for a 1 MB request
>> ~20 kbits/s → keep one Core Duo core busy
>>
>> This was apparently reported to the security list, but hasn't been responded to beyond an acknowledgement on November 24th (the original report didn't make it onto the security list because it was held in a moderation queue).
>>
>> The same vulnerability was reported against various languages and web frameworks, and is already fixed in some of them.
>>
>> Their recommended fix is to randomize the hash function.
>>
>> All the best,
>>
>> Michael
>>
>>
>> --
>> http://www.voidspace.org.uk/
>>
>>
>> May you do good and not evil
>> May you find forgiveness for yourself and forgive others
>> May you share freely, never taking more than you give.
>> -- the sqlite blessing
>> http://www.sqlite.org/different.html
>>
>>
>>
>>
>>
>> _______________________________________________
>> Python-Dev mailing list
>> Python-Dev at python.org
>> http://mail.python.org/mailman/listinfo/python-dev
>> Unsubscribe: http://mail.python.org/mailman/options/python-dev/raymond.hettinger%40gmail.com
>
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/mark%40hotpy.org
More information about the Python-Dev
mailing list