[issue13703] Hash collision security issue

Wed Jan 25 21:23:40 CET 2012

Dave Malcolm <dmalcolm at redhat.com> added the comment:

I think you're right: it will stop matching it during lookup within such
a dict, since the dict will be using the secondary hash for "abc", but
hash() for the C instance.

It will still match outside of the dict, and within other dicts.

So yes, this would be a subtle semantic change when under attack.
Bother.

Having said that, note that within the typical attack scenarios (HTTP
headers, HTTP POST, XML-RPC, JSON), we have a pure-str dict (or
sometimes a pure-bytes dict).  Potentially I could come up with a patch
that only performs this change for such a case (pure-str is easier,
given that we already track this), which would avoid the semantic change
you identify, whilst still providing protection against a wide range of
attacks.

Is it worth me working on this?

> > > Also, the level of complication is far higher than in any other of the
> > > proposed approaches so far (I mean those with patches), which isn't
> > > really a good thing.
> > 
> > So would I.  I want something I can backport, though.
> 
> Well, your approach sounds like it subtly and unpredictably changes the
> behaviour of dicts when there are too many collisions, so I'm not sure
> it's a good idea to backport it, either.
> 
> If we don't want to backport full hash randomization, I think I much
> prefer raising a BaseException when there are too many collisions,
> rather than this kind of (excessively) sophisticated workaround. You
> *are* changing a fundamental datatype in a rather complicated way.

Well, each approach has pros and cons, and we've circled around between
hash randomization vs collision counting vs other approaches for several
weeks.  I've supplied patches for 3 different approaches.

Is this discussion likely to reach a conclusion soon?  Would it be
regarded as rude if I unilaterally ship something close to:
  backport-of-hash-randomization-to-2.7-dmalcolm-2012-01-23-001.patch
in RHEL/Fedora, so that my users have some protection they can enable if
they get attacked? (see http://bugs.python.org/msg151847).  If I do
this, I can post the patches here in case other distributors want to
apply them.

As for python.org, who is empowered to make a decision here?  How can we
move this forward?

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue13703>
_______________________________________