[Python-Dev] Hash collision security issue (now public)
PJ Eby
pje at telecommunity.com
Sat Dec 31 19:04:28 CET 2011
On Sat, Dec 31, 2011 at 7:03 AM, Stephen J. Turnbull <stephen at xemacs.org>wrote:
> While the dictionary probe has to start with a hash for backward
> compatibility reasons, is there a reason the overflow strategy for
> insertion has to be buckets containing lists? How about
> double-hashing, etc?
>
This won't help, because the keys still have the same hash value. ANYTHING
you do to them after they're generated will result in them still colliding.
The *only* thing that works is to change the hash function in such a way
that the strings end up with different hashes in the first place.
Otherwise, you'll still end up with (deliberate) collisions.
(Well, technically, you could use trees or some other O log n data
structure as a fallback once you have too many collisions, for some value
of "too many". Seems a bit wasteful for the purpose, though.)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-dev/attachments/20111231/72554d42/attachment.html>
More information about the Python-Dev
mailing list