Determining number of dict key collisions in a dictionary
rogerb at rogerbinns.com
Tue Dec 2 20:38:49 CET 2008
python at bdurham.com wrote:
> Background: I'm working on a project using very large dictionaries (64
> bit Python) and the question from my client is: how effective is Python's
> default hash technique for our data set?
Python hash functions return a C long, which in a 64-bit process is 32 bits
on Windows and 64 bits on pretty much every other 64-bit environment.
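If you would rather check the hash width on your own interpreter than take the above on trust, something like the following may help. It is only a sketch and assumes Python 3.2 or later, where `sys.hash_info` is available; the helper name `hash_width_bits` is made up for illustration.

```python
import sys

def hash_width_bits():
    """Return the width, in bits, of hash values in the running interpreter."""
    # sys.hash_info was added in Python 3.2; older interpreters lack it.
    return sys.hash_info.width

print("hash width:", hash_width_bits(), "bits")
```

The number printed depends on the platform and build, so run it in the same environment the dictionaries will live in.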
> Their concern is based on the
> belief that Python's default dictionary hash scheme is optimized for 32
> bit vs. 64 bit environments and may not have anticipated the additional
> range of keys that can be generated in a 64 bit environment. Our keys
> are based on 20 to 44 byte ASCII (7-bit) alpha-numeric strings.
Why not have them look at the source code? It is well commented, and
there is another file with various notes. Look at Objects/dictobject.c.
A teaser comment for you:
Most hash schemes depend on having a "good" hash function, in
the sense of simulating randomness. Python doesn't.
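To put a number on collisions for a particular key set, one rough approach is to count them directly. The sketch below is not taken from the Python source; the function names and the `table_size` parameter are hypothetical. It assumes the first slot probed is `hash & (table_size - 1)` with a power-of-two table, which is a CPython implementation detail, and it ignores how a real dict picks and grows its own table size.

```python
from collections import Counter

def count_hash_collisions(keys):
    """Count distinct keys whose full hash() value duplicates another key's."""
    distinct = set(keys)
    return len(distinct) - len({hash(k) for k in distinct})

def count_initial_slot_collisions(keys, table_size):
    """Count distinct keys whose first probe slot is already claimed.

    table_size is a hypothetical stand-in for the dict's internal table
    size and must be a power of two.
    """
    if table_size <= 0 or table_size & (table_size - 1):
        raise ValueError("table_size must be a positive power of two")
    mask = table_size - 1
    # hash() may be negative; & with a non-negative mask yields 0..mask.
    slots = Counter(hash(k) & mask for k in set(keys))
    return sum(n - 1 for n in slots.values())
```

Calling `count_hash_collisions(your_keys)` on a sample of the real 20 to 44 byte keys shows how often full hash values coincide, and `count_initial_slot_collisions(your_keys, 2**24)` gives a feel for first-probe clashes at one assumed table size. On Python 3.3 and later, string hashes are randomized per process unless `PYTHONHASHSEED` is set, so counts for string keys can differ from run to run.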