[issue5186] Reduce hash collisions for objects with no __hash__ method

Raymond Hettinger report at bugs.python.org
Thu Feb 12 03:08:47 CET 2009


Raymond Hettinger <rhettinger at users.sourceforge.net> added the comment:

[Antoine]
> Ok, updated patch:
> - uses a 4-bit rotate (not shift)
> - avoids comparing an unsigned long to -1
> - tries to streamline the win64 special path (but I can't test)

pointer_hash4.patch looks fine to me.  Still, I think it's worth
considering the simpler and faster:  x |= x>>4.  The latter doesn't
require any special-casing for various pointer sizes.  It just works.
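
For illustration, a 4-bit rotate like the one mentioned in the quoted
patch notes might look like this in Python. This is only a sketch, not
the code in pointer_hash4.patch: the name `rotate_right_4`, the fixed
64-bit width, and the sample addresses are assumptions made here. It
shows why the low bits matter when addresses are 16-byte aligned and a
small table picks its slot from the low bits of the hash.

```python
PTR_BITS = 64  # assumed pointer width; a real patch must handle other sizes
MASK = (1 << PTR_BITS) - 1

def rotate_right_4(x):
    """Rotate a PTR_BITS-wide unsigned value right by 4 bits."""
    return ((x >> 4) | (x << (PTR_BITS - 4))) & MASK

# Hypothetical addresses, 16 bytes apart, so the low 4 bits are always zero.
addresses = [0x7F0000001000 + 16 * i for i in range(8)]

# An 8-slot table takes its index from the low 3 bits of the hash.
raw_slots = {a & 7 for a in addresses}
rotated_slots = {rotate_right_4(a) & 7 for a in addresses}

print(len(raw_slots), len(rotated_slots))
```

With these made-up addresses every raw value lands in the same slot,
while the rotated values spread across all eight.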

[Adam]
> Adding an arbitrary set of OR, XOR, or add makes me uneasy;
> I know enough to do them wrong (reduce entropy), but not 
> enough to do them right.

It's easy enough to prove that a transform keeps all the entropy (just
show that the function is reversible) and easy enough to test:

   assert len(set(ids)) == len(set(map(f, set(ids))))  # for any large group of ids
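
Both checks can be sketched in a few lines. The helper names below
(`rotr4`, `rotl4`, `is_collision_free`) and the 64-bit width are
assumptions for illustration, and the rotate merely stands in for
whichever transform is under review. Reversibility is shown by
exhibiting an inverse; the set-size comparison is the empirical test,
run here on the ids of live objects.

```python
BITS = 64  # assumed width of an id
MASK = (1 << BITS) - 1

def rotr4(x):
    """Candidate transform: rotate right by 4 bits."""
    return ((x >> 4) | (x << (BITS - 4))) & MASK

def rotl4(x):
    """Inverse of rotr4: rotate left by 4 bits."""
    return ((x << 4) | (x >> (BITS - 4))) & MASK

def is_collision_free(f, ids):
    """True if f maps the distinct values in ids to distinct results."""
    unique = set(ids)
    return len(unique) == len(set(map(f, unique)))

# Keep the objects alive so their ids stay distinct while we test.
objects = [object() for _ in range(100000)]
ids = [id(o) for o in objects]

# Proof by inverse: undoing the transform recovers every input.
assert all(rotl4(rotr4(i)) == i for i in ids)
# Empirical test: no two ids collapse to the same value.
assert is_collision_free(rotr4, ids)
```

A transform that merges two distinct inputs makes `is_collision_free`
return False for any sample that contains both of them.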

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue5186>
_______________________________________

