[Python-Dev] Hash collision security issue (now public)
Steven D'Aprano
steve at pearwood.info
Sat Dec 31 03:19:01 CET 2011
Jim Jewett wrote:
> My personal opinion is that accepting *and parsing* enough data for
> this to be a problem is enough of an edge case that I don't want
> normal dicts slowed down at all for this; I would therefore prefer
> that the change be restricted to such a compile-time switch, with
> current behavior the default.
By compile-time, do you mean when the byte-code is compiled, i.e. just
before runtime, rather than a switch set when compiling the Python executable
from source? I will assume so.
I'm not a big fan of compile-time (runtime) switches. They make it too hard to
compare before-and-after behaviour within a single session, and impossible to
have fine control over which objects have which behaviour. I don't like
all-or-nothing settings. (E.g. I'd love to be able to turn -O optimization on
and off on a per-function basis, but can't.)
How about using a similar strategy to the current dict behaviour with
__missing__ and defaultdict? Here's my suggestion:
- If a dict subclass defines __salt__, then it is called to salt the hash
  value before lookups. If __salt__ is undefined or None, the current
  behaviour remains unchanged.
- Add a dict subclass (saltdict, for lack of a better name) to the
  collections module that defines __salt__ appropriately. In this case, I
  don't know enough to suggest what an appropriate salt is. I leave that to
  the security experts to argue about.
- Update the relevant standard library modules to use saltdict where needed.
This allows a single application or framework to use saltdict where necessary,
without slowing down all dict accesses. Dicts which never see user-generated
input (e.g. globals) can remain full-speed.
If there is no consensus about the best salting strategy, then apps can choose
their own by subclassing dict.
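A minimal pure-Python sketch of the idea follows. It is only an emulation: the real dict has no __salt__ hook, so the names HookedDict, SaltDict and _Key are hypothetical, the hook signature (__salt__ receives the hash value and returns the salted one) is an assumption, and the mixing step is a placeholder rather than a vetted salting strategy.

```python
import os
from collections.abc import MutableMapping


class _Key:
    """Hypothetical wrapper pairing a key with its (possibly salted) hash."""

    __slots__ = ("key", "hash")

    def __init__(self, key, hash_value):
        self.key = key
        self.hash = hash_value

    def __hash__(self):
        return self.hash

    def __eq__(self, other):
        return isinstance(other, _Key) and self.key == other.key


class HookedDict(MutableMapping):
    """Stand-in for a dict that consults __salt__, much as dict consults __missing__."""

    __salt__ = None  # undefined or None: hashes are used unchanged

    def __init__(self, *args, **kwargs):
        self._data = {}
        self.update(*args, **kwargs)

    def _wrap(self, key):
        h = hash(key)
        salt = type(self).__salt__
        if salt is not None:
            h = salt(self, h)  # salt the hash value before the lookup
        return _Key(key, h)

    def __getitem__(self, key):
        try:
            return self._data[self._wrap(key)]
        except KeyError:
            raise KeyError(key) from None

    def __setitem__(self, key, value):
        self._data[self._wrap(key)] = value

    def __delitem__(self, key):
        try:
            del self._data[self._wrap(key)]
        except KeyError:
            raise KeyError(key) from None

    def __iter__(self):
        return (wrapped.key for wrapped in self._data)

    def __len__(self):
        return len(self._data)


class SaltDict(HookedDict):
    """The proposed saltdict, with a placeholder salting strategy."""

    def __init__(self, *args, **kwargs):
        # Per-instance random secret; must exist before any key is stored.
        self._secret = int.from_bytes(os.urandom(8), "little")
        super().__init__(*args, **kwargs)

    def __salt__(self, hash_value):
        # Placeholder mixing only; not a vetted defence.
        return hash((self._secret, hash_value))
```

An application that prefers a different strategy would subclass HookedDict and supply its own __salt__, while code that never defines the hook keeps using unsalted hashes.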
Responsibility for doing the right thing falls onto the library author, rather
than Python itself. Some people may consider that a minus.
--
Steven