[issue13703] Hash collision security issue

Marc-Andre Lemburg report at bugs.python.org
Mon Feb 6 22:41:04 CET 2012


Marc-Andre Lemburg <mal at egenix.com> added the comment:

Gregory P. Smith wrote:
> 
> Gregory P. Smith <greg at krypto.org> added the comment:
> 
>>
>>> The release managers have pronounced:
>>> http://mail.python.org/pipermail/python-dev/2012-January/115892.html
>>> Quoting that email:
>>>> 1. Simple hash randomization is the way to go. We think this has the
>>>> best chance of actually fixing the problem while being fairly
>>>> straightforward such that we're comfortable putting it in a stable
>>>> release.
>>>> 2. It will be off by default in stable releases and enabled by an
>>>> envar at runtime. This will prevent code breakage from dictionary
>>>> order changing as well as people depending on the hash stability.
>>
>> Right, but that doesn't contradict what I wrote about adding
>> env vars to fix a seed and optionally enable using a random
>> seed, or adding collision counting as extra protection for
>> cases that are not addressed by the hash seeding, such as
>> collisions caused by 3rd-party types or numbers.
> 
> We won't be back-porting anything more than the hash randomization for
> 2.6/2.7/3.1/3.2 but we are free to do more in 3.3 if someone can
> demonstrate it working well and a need for it.
> 
> For me, things like collision counting and tree based collision
> buckets when the types are all the same and known comparable make
> sense but are really sounding like a lot of additional complexity. I'd
> *like* to see active black-box design attack code produced that goes
> after something like a wsgi web app written in Python with hash
> randomization *enabled* to demonstrate the need before we accept
> additional protections like this for 3.3+.

I posted several examples of the integer collision attack on this
ticket. The current randomization patch does not address this at all;
the collision counting patch does, which is why I think both are
needed.
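
To illustrate why seeding does not help with integers, here is a minimal
sketch. It assumes CPython 3.2 or later, where sys.hash_info exists and the
hash of a non-negative int is its value modulo sys.hash_info.modulus. The
helper names colliding_ints and max_same_hash are made up for this example
and are not taken from either patch; the counting helper is only a rough
stand-in for the idea behind collision counting.

```python
import sys

# Modulus CPython uses when hashing integers (an assumption about CPython 3.2+;
# other implementations may hash integers differently).
P = sys.hash_info.modulus


def colliding_ints(n):
    """Return n distinct non-negative ints that all share one hash value."""
    return [i * P + 1 for i in range(n)]


def max_same_hash(keys):
    """Return the largest number of keys that share a single hash value."""
    counts = {}
    for k in keys:
        h = hash(k)
        counts[h] = counts.get(h, 0) + 1
    return max(counts.values())


keys = colliding_ints(1000)
print(len(set(keys)), "distinct keys,",
      len({hash(k) for k in keys}), "distinct hash value(s)")
```

Since the seed only enters the string hash, these keys collide whatever
PYTHONHASHSEED is set to. A check along the lines of max_same_hash would
flag them.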

Note that my comment was more about the desire to *not* recommend
using random hash seeds by default, but instead to advocate using
a random but fixed seed, or at least to document that random
seeds chosen at interpreter startup will cause
problems with the repeatability of application runs.
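
The repeatability point can be checked with a small sketch. It assumes an
interpreter that honours the PYTHONHASHSEED environment variable (the envar
the randomization patch introduces); child_hash is a hypothetical helper
written for this example.

```python
import os
import subprocess
import sys


def child_hash(seed, text="abc"):
    """Start a fresh interpreter with PYTHONHASHSEED=seed; return hash(text)."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    out = subprocess.check_output(
        [sys.executable, "-c", "import sys; print(hash(sys.argv[1]))", text],
        env=env,
    )
    return int(out.decode().strip())


# A fixed seed gives the same string hash on every run.
print("fixed seed :", child_hash(42), child_hash(42))
# With "random", each interpreter start picks its own seed, so the two
# values will normally differ between runs.
print("random seed:", child_hash("random"), child_hash("random"))
```

With a fixed seed, anything that depends on string hashes (such as dict
iteration order in these releases) stays the same from run to run; with
per-start random seeds it does not.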

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue13703>
_______________________________________

