Mailman 3 Hash collision security issue (now public) - Python-Dev

Dec. 30, 2011

      In http://mail.python.org/pipermail/python-dev/2011-December/115138.html,
Christian Heimes
pointed out that
...
... we don't have to alter the outcome of hash ... We just need to reduce the chance that
an attacker can produce collisions in the dict (and set?)
I'll state it more strongly.  hash probably should not change (at
least for this), but we may
want to consider a different conflict resolution strategy when the
first slot is already filled.

Remember that there was a fair amount of thought and timing effort put
into selecting the
current strategy; it is deliberately sub-optimal for random input, in
order to do better with
typical input.      <
http://hg.python.org/cpython/file/7010fa9bd190/Objects/dictnotes.txt >

If there is a change, it would currently be needed in three places for
each of set and dict
(the lookdict functions and insertdict_clean).  It may be worth adding
some macros just to
keep those six in sync. Once those macros are in place, that allows a
compile-time switch.

My personal opinion is that accepting *and parsing* enough data for
this to be a problem
is enough of an edge case that I don't want normal dicts slowed down
at all for this; I would
therefore prefer that the change be restricted to such a compile-time
switch, with current
behavior the default.

http://hg.python.org/cpython/file/7010fa9bd190/Objects/dictobject.c#l571

   583    for (perturb = hash; ep->me_key != NULL; perturb >>= PERTURB_SHIFT) {
   584         i = (i << 2) + i + perturb + 1;

PERTURB_SHIFT is already a private #define to 5; per dictnotes, 4 and 6 perform
almost as well.  Someone worried can easily make that change today,
and be protected
from "generic" anti-python attacks.

I believe the salt suggestions have equivalent to replacing
perturb = hash;
with something like    perturb = hash + salt;

Changing     i = (i << 2) + i + perturb + 1;    would allow
effectively replacing the initial hash,
but risks spoiling performance in the non-adversary case.

Would there be objections to replacing those two lines with something like:

    for (perturb = FIRST_PERTURB(hash, key);
         ep->me_key != NULL;
         perturb = NEXT_PERTURB(hash, key, perturb)) {
        i = NEXT_SLOT(i, perturb);

The default macro definitions should keep things as they are

    #define FIRST_PERTURB(hash, key)    hash
    #define NEXT_PERTURB(hash, key, perturb)    perturb >> PERTURB_SHIFT
    #define NEXT_SLOT(i, perturb)    (i << 2) + i + perturb + 1

while allowing #ifdefs for (slower but) safer things like adding a
salt, or even using
alternative hashes.

-jJ

Hash collision security issue (now public)

Jim Jewett

Steven D'Aprano

Christian Heimes

Terry Reedy

Steven D'Aprano

Christian Heimes

Terry Reedy

tags

participants (4)