[Python-Dev] Hash collision security issue (now public)
Stephen J. Turnbull
stephen at xemacs.org
Mon Jan 2 15:32:19 CET 2012
Christian Heimes writes:
> Am 31.12.2011 13:03, schrieb Stephen J. Turnbull:
> > I don't know the implementation issues well enough to claim it is a
> > solution, but this hasn't been mentioned before AFAICS:
> >
> > While the dictionary probe has to start with a hash for backward
> > compatibility reasons, is there a reason the overflow strategy for
> > insertion has to be buckets containing lists? How about
> > double-hashing, etc?
>
> Python's dict implementation doesn't use bucket but open addressing (aka
> closed hashed table). The algorithm for conflict resolution doesn't use
> double hashing. Instead it takes the original and (in most cases) cached
> hash and perturbs the hash with a series of add, multiply and bit shift ops.
In an attack, this is still O(collisions) per probe (as any scheme
where the address of the nth collision is a function of only the
hash), where double hashing should be "roughly" O(1) (with double the
coefficient).
But that evidently imposes too large a performance burden on non-evil
users, so it's not worth thinking about whether "roughly O(1)" is
close enough to O(1) to deter or exhaust attackers. I withdraw the
suggestion.
More information about the Python-Dev
mailing list