[Python-Dev] Hash collision security issue (now public)
Stephen J. Turnbull
stephen at xemacs.org
Sat Dec 31 13:03:22 CET 2011
Victor Stinner writes:
> Let's try to summarize this "vulnerability".
>
> The creation of a Python dictionary has a complexity of O(n) in most
> cases, but O(n^2) in the *worst* case. The attack tries to trigger the
> worst case. It requires computing a set of N keys that all have the same
> hash value (hash(key1) == hash(key2) == ... == hash(keyN)). These keys
> only have to be computed once. It looks like it is now cheap enough in
> practice to compute this dataset for Python (and other languages).
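
To make the worst case in the summary above concrete, here is a minimal sketch. It assumes a hypothetical key class, CollidingKey, whose instances all return the same hash, and it counts equality comparisons rather than timing anything. Real attacks use strings with genuinely colliding hashes; the constant hash here is only a stand-in.

```python
class CollidingKey:
    """Hypothetical key type: every instance has the same hash value."""

    eq_calls = 0  # class-wide counter of equality comparisons

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        # All keys collide on purpose, imitating a precomputed attack set.
        return 42

    def __eq__(self, other):
        CollidingKey.eq_calls += 1
        return isinstance(other, CollidingKey) and self.value == other.value


def count_comparisons(n):
    """Insert n colliding keys into a dict; return how many __eq__ calls it took."""
    CollidingKey.eq_calls = 0
    d = {}
    for i in range(n):
        d[CollidingKey(i)] = None
    return CollidingKey.eq_calls
```

Doubling n should roughly quadruple the value returned by count_comparisons, since each new key has to be compared against every key already stored with the same hash.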

I don't know the implementation issues well enough to claim it is a
solution, but this hasn't been mentioned before AFAICS:

While the dictionary probe has to start with a hash for backward
compatibility reasons, is there a reason the overflow strategy for
insertion has to be buckets containing lists? How about
double-hashing, etc?
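
For reference, a minimal sketch of what double hashing means. This is a toy fixed-size table, not how any real dict is implemented; the class name DoubleHashTable and the caller-supplied functions h1 and h2 are assumptions made for illustration. The first hash picks the starting slot and the second hash picks the step between probes.

```python
class DoubleHashTable:
    """Toy open-addressing table with double hashing (no resize, no delete)."""

    def __init__(self, capacity, h1, h2):
        # capacity should be prime so that every step size visits all slots.
        self.capacity = capacity
        self.slots = [None] * capacity
        self.h1 = h1  # primary hash: chooses the starting slot
        self.h2 = h2  # secondary hash: chooses the probe step
        self.probes = 0  # total slots inspected by put(), for comparison

    def _probe(self, key):
        start = self.h1(key) % self.capacity
        # Step is always in 1..capacity-1, so it is never zero.
        step = 1 + self.h2(key) % (self.capacity - 1)
        for i in range(self.capacity):
            yield (start + i * step) % self.capacity

    def put(self, key, value):
        for idx in self._probe(key):
            self.probes += 1
            entry = self.slots[idx]
            if entry is None or entry[0] == key:
                self.slots[idx] = (key, value)
                return
        raise RuntimeError("table is full")

    def get(self, key):
        for idx in self._probe(key):
            entry = self.slots[idx]
            if entry is None:
                raise KeyError(key)
            if entry[0] == key:
                return entry[1]
        raise KeyError(key)


def total_probes(h2, n=50, capacity=101):
    """Insert n integer keys that all collide under h1; return probes used."""
    table = DoubleHashTable(capacity, h1=lambda k: 0, h2=h2)
    for k in range(n):
        table.put(k, k * k)
    return table.probes
```

With h2=lambda k: 0 every key takes the same step, which is plain linear probing, while h2=lambda k: k gives each key its own step and total_probes comes out lower. The benefit depends on the second hash not colliding as well: keys that agree under both h1 and h2 still follow identical probe sequences.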