[issue13703] Hash collision security issue

Alex Gaynor report at bugs.python.org
Fri Jan 27 00:22:32 CET 2012


Alex Gaynor <alex.gaynor at gmail.com> added the comment:

I'm sorry, but I'm a little confused.  I think we established pretty clearly
earlier that requiring users to make changes everywhere they store user data
would be dangerous, because those locations are often in libraries or other
places where the code creating and modifying the dictionary has no idea that
it contains user data.

The proposed AVL solution fails if it requires users to fundamentally
restructure their data depending on its origin.

We have a solution that is known to work in all cases: hash randomization.
Three issues with it were discussed:

a) Code assuming a stable ordering of dictionaries.
b) Code assuming hashes are stable across runs.
c) Code reimplementing the hashing algorithm of a core datatype that is now
randomized.
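
As a rough illustration of (b), the sketch below runs a child interpreter
under two different hash seeds and compares the hash of the same string. It
assumes a CPython version where randomization is available and controlled by
the PYTHONHASHSEED environment variable; the helper name str_hash_in_child
is made up for this example.

```python
import os
import subprocess
import sys


def str_hash_in_child(seed, text="spam"):
    """Return hash(text) as computed by a fresh interpreter with the given seed."""
    # PYTHONHASHSEED pins the randomization seed for the child process only.
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    out = subprocess.check_output(
        [sys.executable, "-c", "import sys; print(hash(sys.argv[1]))", text],
        env=env,
    )
    return int(out.decode().strip())


same_a = str_hash_in_child(1)
same_b = str_hash_in_child(1)
other = str_hash_in_child(2)

print("seed 1 twice:", same_a, same_b)
print("seed 2:      ", other)
```

With the same seed the two runs should agree; with different seeds the values
should differ, which is exactly what breaks code that persists or compares
string hashes across processes.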

I don't think any of these are realistic issues in the way that "doesn't
protect all cases" is.  (a) was never a documented or intended property;
indeed, it breaks all the time: if you insert keys in a different order, use
a different platform, or change anything else.  (b) For the same reasons,
code relying on stable hashes only worked if you didn't change anything.  In
practice, I'm convinced that code relying on either of these was never common
(if it ever existed).  Finally, (c): while it's a concern, I've reviewed
Django, SQLAlchemy, PyPy, and the stdlib, and there is only one place where
compatibility with a core hash is attempted: decimal.Decimal.
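
To make the kind of compatibility in (c) concrete, here is a small sketch of
the invariant a type like decimal.Decimal has to maintain: values that
compare equal must hash equal, or dictionary lookups across types stop
working. It assumes Python 3 semantics for numeric comparison and hashing,
and it does not claim to show which part of Decimal's hashing the review
above refers to.

```python
from decimal import Decimal
from fractions import Fraction

# Pairs of equal values expressed in different numeric types.
pairs = [
    (Decimal(1), 1),
    (Decimal("1.5"), 1.5),
    (Decimal("0.25"), Fraction(1, 4)),
]

# Equal values must produce equal hashes for dict and set lookups to work.
hashes_agree = all(a == b and hash(a) == hash(b) for a, b in pairs)

# A dict keyed by int can therefore be queried with an equal Decimal.
lookup = {1: "one"}
found = lookup[Decimal(1)]

print(hashes_agree, found)
```

Any type that mirrors a core type's hash this way has to track changes to
that core hash, which is why such reimplementations are the case worth
auditing.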

In summary, I think the case against hash randomization has been seriously
overstated, and that it is in no way more dangerous than a solution that
fails to solve the problem comprehensively.  Further, I think it is
imperative that we reach a consensus on this quickly: the only reason this
hasn't been widely exploited yet is that the data is not yet available.  When
it becomes available, I firmly expect just about every high-profile Python
site on the internet (of which there are many) to be attacked.
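
For readers who want to see why colliding keys are costly, the sketch below
forces every key into the same hash bucket and counts equality comparisons
while a dictionary is built. The CollidingKey class and the constant hash
value are hypothetical stand-ins for attacker-chosen strings; a real attack
would use precomputed colliding strings rather than a custom class.

```python
class CollidingKey:
    """Key type whose instances all share one hash value (illustration only)."""

    eq_calls = 0  # counts how often the dict had to fall back to __eq__

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return 42  # every key collides with every other key

    def __eq__(self, other):
        CollidingKey.eq_calls += 1
        return isinstance(other, CollidingKey) and self.value == other.value


def comparisons_to_build(n):
    """Insert n colliding keys into a dict and return the number of __eq__ calls."""
    CollidingKey.eq_calls = 0
    table = {}
    for i in range(n):
        table[CollidingKey(i)] = None
    return CollidingKey.eq_calls


small = comparisons_to_build(100)
large = comparisons_to_build(200)
print(small, large)
```

Each new key has to be compared against every key already stored, so the
total work grows roughly with the square of the number of keys instead of
linearly.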

On Thu, Jan 26, 2012 at 6:03 PM, Martin v. Löwis <report at bugs.python.org> wrote:

>
> Martin v. Löwis <martin at v.loewis.de> added the comment:
>
> > But using non-__builtin__.str objects (such as UserString) would expose
> > the user to an attack?
>
> Not necessarily: only if they use these strings as dictionary keys, and
> only if they do so in contexts where arbitrary user input is consumed. In
> these cases, users need to rewrite their code to replace the keys. Using
> dictionary wrappers (such as UserDict), this is possible using only local
> changes.
>
> ----------
>
> _______________________________________
> Python tracker <report at bugs.python.org>
> <http://bugs.python.org/issue13703>
> _______________________________________
>


