On Thu, Dec 29, 2011 at 8:32 AM, Christian Heimes <lists@cheimes.de> wrote:
IMHO we don't have to alter the outcome of hash("some string"), hash(1)
and all other related types. We just need to reduce the change the an
attacker can produce collisions in the dict (and set?) code that looks
up the slot (PyDictEntry). How about adding the random value in
Object/dictobject.c:lookdict() and lookdict_str() (Python 2.x) /
lookdict_unicode() (Python 3.x)? With this approach the hash of all our
objects stay the same and just the dict code needs to be altered.

I don't understand how that helps a collision attack.  If you can still generate two strings with the same (pre-randomized) hash, what difference does it make that the dict adds a random number?  The post-randomized number will still be the same, no?

Or does this attack just rely on the hash *remainders* being the same?  If so, I can see how hashing the hash would help.  But since the attacker doesn't know the modulus, and it can change as the dictionary grows, I would expect the attack to require matching hashes, not just matching hash remainders...  unless I'm just completely off base here.