[issue13703] Hash collision security issue
report at bugs.python.org
Fri Jan 20 05:58:37 CET 2012
Frank Sievertsen <python at sievertsen.de> added the comment:
>> That's true. But without the suffix, I can pretty easy and efficient
>> guess the prefix by just seeing the result of a few well-chosen and
>> short repr(dict(X)). I suppose that's harder with the suffix.
> Since the hash function is known, it doesn't make things much
> harder. Without suffix you just need hash('') to find out what
> the prefix is. With suffix, two values are enough
This is obvious and absolutely correct!
But it's not what I talked about. I didn't talk about the result of
hash(X), but about the result of repr(dict([(str, val), (str,
val), ...])), which is more likely to happen and not so trivial
(if you want to know more than the last 8 bits).
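To illustrate how key order alone can leak part of a secret prefix, here is
a minimal sketch. Everything in it is an assumption made for illustration:
toy_hash is a made-up 16-bit hash that only loosely resembles the real
string hash, and slot_order models a table that lists keys purely by slot
index and ignores probing. Current CPython dicts keep insertion order, so
this models only the hash-ordered behaviour discussed here.

```python
def toy_hash(s, prefix):
    # Hypothetical 16-bit multiplicative hash seeded with a secret prefix.
    h = prefix
    for c in s:
        h = ((h * 1000003) ^ ord(c)) & 0xFFFF
    return h


def slot_order(keys, prefix, size=8):
    # Order in which a hash-ordered table with `size` slots would list the keys.
    return sorted(keys, key=lambda k: toy_hash(k, prefix) % size)


keys = list("abcdefgh")   # short, well-chosen keys: one per slot
secret = 0xB5             # prefix the observer does not know

# The observer only sees the order of the keys in the output.
observed = slot_order(keys, secret)

# Try every 8-bit prefix and keep those that reproduce the observed order.
candidates = [p for p in range(256) if slot_order(keys, p) == observed]
```

In this toy model the surviving candidates all share the low three bits of
the secret, because the table has eight slots. Recovering more bits needs
more work, which is the "not so trivial" part.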
IMHO this problem shows that we can't advise dict() or set() for
(potentially dangerous) user-supplied keys at the moment.
I prefer randomization because it fixes this problem. The
collision-counting->exception keeps software from becoming slow,
but it doesn't make it work as expected.
Sure, you can catch the exception. But when you get the exception,
probably you wanted to add the items for a reason: Because you want
them to be in the dict and that's how your software works.
Imagine an irc-server using a dict to store the connected users, using
the nicknames as keys. Even if the irc-server catches the unexpected
exception while connecting a new user (when adding his/her name to the
dict), an attacker could connect 999 special-named users to prevent a
specific user from connecting in future.
Collision-counting->exception can make it possible to inhibit a
specific future add to the dict. The outcome is highly application
dependent.
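A rough sketch of the irc-server scenario, under loud assumptions:
CollisionLimitedDict, TooManyCollisions and weak_hash are hypothetical names
invented here, the table is a simple chained one rather than the real dict
implementation, and the limit is 4 instead of 1000 to keep the example short.

```python
class TooManyCollisions(Exception):
    """Raised when an insert would exceed the per-bucket collision limit."""


class CollisionLimitedDict:
    """Toy chained hash table that refuses inserts into a full bucket."""

    def __init__(self, hash_fn, n_buckets=8, limit=4):
        self._hash_fn = hash_fn
        self._buckets = [[] for _ in range(n_buckets)]
        self._limit = limit

    def _bucket(self, key):
        return self._buckets[self._hash_fn(key) % len(self._buckets)]

    def __setitem__(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)   # overwrite an existing entry
                return
        if len(bucket) >= self._limit:
            raise TooManyCollisions(key)   # the add is refused, not just slowed
        bucket.append((key, value))

    def __contains__(self, key):
        return any(k == key for k, _ in self._bucket(key))


def weak_hash(nick):
    # Deliberately weak stand-in hash: anagrams always collide.
    return sum(ord(c) for c in nick)


users = CollisionLimitedDict(weak_hash, limit=4)

# The attacker registers nicknames that collide with the victim's nickname.
for nick in ["ecila", "licea", "aleci", "cliea"]:
    users[nick] = "attacker"

# The server catches the exception, but the victim still cannot connect.
try:
    users["alice"] = "victim"
    blocked = False
except TooManyCollisions:
    blocked = True

# A nickname that lands in another bucket is unaffected.
users["bob"] = "bystander"
```

Here the server stays fast and survives the exception, yet "alice" is locked
out for as long as the colliding entries remain in the table.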
I think it fixes 95% of the attack vectors, but not all, and it adds a
few new risks. However, of course it's much better than doing nothing
to fix the problem.
Python tracker <report at bugs.python.org>