
[Jeff Epler]
Is there at least a guarantee that the hashing algorithm won't change in a bugfix release?
Guido said "yes" weakly, but the issue hasn't come up in recent times. In the past we've changed the hash functions for at least strings, tuples, and floats, based on systematic weaknesses uncovered by real-life ordinary data. OTOH, there's a requirement that, for objects of types that can be used as dict keys, two objects that compare equal must deliver equal hash codes, so people creating (mostly) number-like or (not sure if anyone does this) string-like types have to duplicate the hash codes Python delivers for builtin numbers and strings that compare equal to objects of their types. For example, the author of a Rational class should arrange for hash(Rational(42, 1)) to deliver the same result as hash(42) == hash(42L) == hash(42.0) == hash(complex(42.0, 0.0)) Such code would break if we changed the int/long/float/complex hashes for inputs that compare equal to integers. Tedious exercise for the reader: find a set of bad datetime objects in 2.3 ("bad" in the sense of their hash codes colliding; on a box where hash() returns a 32-bit int, there must be collisions, since datetime objects have way more than 32 independent bits of state).