[Python-Dev] Hash collision security issue (now public)

Steven D'Aprano steve at pearwood.info
Sat Dec 31 03:19:01 CET 2011

Jim Jewett wrote:

> My personal opinion is that accepting *and parsing* enough data for
> this to be a problem is enough of an edge case that I don't want
> normal dicts slowed down at all for this; I would therefore prefer
> that the change be restricted to such a compile-time switch, with
> current behavior the default.

By compile-time, do you mean when the byte-code is compiled, i.e. just
before runtime, rather than a switch when compiling the Python executable from
source? I will assume so.

I'm not a big fan of compile-time (runtime) switches. They make it too hard to
compare before-and-after behaviour within a single session, and impossible to
have fine control over which objects have which behaviour. I don't like
all-or-nothing settings. (E.g. I'd love to be able to turn -O optimization on 
and off on a per-function basis, but can't.)

How about using a similar strategy to the current dict behaviour with 
__missing__ and defaultdict? Here's my suggestion:

- If a dict subclass defines __salt__, then it is called to salt the hash
   value before lookups. If __salt__ is undefined or None, the current
   behaviour remains unchanged.

- Add a dict subclass (saltdict, for lack of a better name) to the
   collections module that defines __salt__ appropriately. I don't know
   enough to suggest what an appropriate salt is. I leave that to the
   security experts to argue about.

- Update the relevant standard library modules to use saltdict where needed.
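To make the suggestion concrete, here is a rough pure-Python sketch. The
first class shows the existing __missing__ hook that the proposal is modelled
on. The rest emulates what a salted dict might look like. Note that __salt__
is not a real dict hook, so the emulation wraps each key instead; SaltDict,
_SaltedKey and the use of os.urandom for the salt are assumptions made purely
for illustration, and the mixing step is not a vetted defence.

```python
import os
from collections.abc import MutableMapping


class ZeroDict(dict):
    """Existing hook: dict.__getitem__ calls __missing__ on a failed lookup."""

    def __missing__(self, key):
        return 0


class _SaltedKey:
    """Hypothetical helper: wraps a key so its hash depends on a salt."""

    __slots__ = ("key", "_hash")

    def __init__(self, key, salt):
        self.key = key
        # Placeholder mixing only. Keys whose hash() values are already
        # equal still collide here; a real design needs expert review.
        self._hash = hash((salt, key))

    def __hash__(self):
        return self._hash

    def __eq__(self, other):
        return isinstance(other, _SaltedKey) and self.key == other.key


class SaltDict(MutableMapping):
    """Hypothetical emulation of the proposed saltdict."""

    def __init__(self, *args, **kwargs):
        self._salt = self.__salt__()  # chosen once per instance
        self._data = {}
        self.update(*args, **kwargs)

    def __salt__(self):
        # Assumption: a random per-dict integer; subclasses may override.
        return int.from_bytes(os.urandom(8), "little")

    def _wrap(self, key):
        return _SaltedKey(key, self._salt)

    def __getitem__(self, key):
        try:
            return self._data[self._wrap(key)]
        except KeyError:
            raise KeyError(key) from None

    def __setitem__(self, key, value):
        self._data[self._wrap(key)] = value

    def __delitem__(self, key):
        try:
            del self._data[self._wrap(key)]
        except KeyError:
            raise KeyError(key) from None

    def __iter__(self):
        return (wrapped.key for wrapped in self._data)

    def __len__(self):
        return len(self._data)
```

Ordinary dicts are untouched by any of this, so only code that opts in to
SaltDict pays the wrapping cost. An application that wants a different
strategy would subclass and override __salt__.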

This allows a single application or framework to use saltdict where necessary, 
without slowing down all dict accesses. Dicts which never see user-generated 
input (e.g. globals) can remain full-speed.

If there is no consensus about the best salting strategy, then apps can choose 
their own by subclassing dict.

Responsibility for doing the right thing falls onto the library author, rather 
than Python itself. Some people may consider that a minus.

