[Python-Dev] Hash collision security issue (now public)
Jim Jewett
jimjjewett at gmail.com
Mon Jan 2 00:28:02 CET 2012
Steven D'Aprano (in
<http://mail.python.org/pipermail/python-dev/2011-December/115162.html>)
wrote:
> By compile-time, do you mean when the byte-code is compilated, i.e. just
> before runtime, rather than a switch when compiling the Python executable from
> source?
No. I really mean when the C code is initially compiled to produce an
python executable.
The only reason we're worrying about this is that an adversary may
force worst-case performance. If the python instance isn't a server,
or at least isn't exposed to untrusted clients, then even a single
extra "if" test is unjustified overhead. Adding overhead to every
string hash or every dict lookup is bad.
That said, adding some overhead (only) to dict lookups *that already
hit half a dozen consecutive collisions* probably is reasonable,
because that won't happen very often with normal data. (6 collisions
can't happen at all unless there are already at least 6 entries, so
small dicts are safe; with at least 1/3 of the slots empty, it should
happen only 1/729 for worst-size larger dicts.)
-jJ
More information about the Python-Dev
mailing list