
Short course: the average number of probes needed when searching small dicts/sets can be reduced, in both successful ("found") and failing ("not found") cases.

But I'm not going to pursue this. This is a brain dump for someone who's willing to endure the interminable pain of arguing about benchmarks ;-)

Background: http://bugs.python.org/issue30671 raised some questions about how dict collisions are handled. While the analysis there didn't make sense to me, I wrote enough code to dig into it.

As detailed in that bug report, the current implementation appeared to meet the theoretical performance of "uniform hashing", meaning there was no room left for improvement.

However, that missed something: the simple expressions for the expected number of probes under uniform hashing are upper bounds, and while they're excellent approximations at modest load factors in sizable tables, for small tables they significantly overstate the truth. For example, for a table with 5 items in 8 slots, the load factor is a = 5/8 = 0.625, and

    avg probes when found     = log(1/(1-a))/a = 1.57
    avg probes when not found = 1/(1-a)        = 2.67

However, exact analysis gives 1.34 and 2.25 instead (a sketch of the exact computation appears at the end of this message).

The current code achieves the upper bounds, but not the exact values. As a sanity check, a painfully slow implementation of uniform hashing does achieve the exact values.

Code for all this is attached, in a framework that allows you to easily plug in any probe sequence strategy. The current strategy is implemented by generator "current". There are also implementations of "linear" probing, "quadratic" probing, "pre28201" probing (the current strategy before bug 28201 repaired an error in its coding), "uniform" probing, and ... "double". Rough sketches of a few of these generators also appear at the end of this message.

The last is one form of "double hashing" that gets very close to "uniform". Its loop guts are significantly cheaper than the current scheme's: just 1 addition and 1 mask. However, it requires a non-trivial modulus to get started, and that's expensive. Is there a cheaper way to get close to "uniform"? I don't know - this was just the best I came up with so far.

Does it matter? See above ;-) If you want to pursue this, take these as given:

1. The first probe must be simply the last `nbits` bits of the hash code. The speed of the first probe is supremely important; that's the fastest possible first probe, and it guarantees no collisions at all for a dict indexed by a contiguous range of integers (an important use case).

2. The probe sequence must (at least eventually) depend on every bit in the hash code. Else it's waaay too easy to stumble into quadratic-time behavior for "bad" sets of keys, even by accident.

Have fun :-)
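
Appendix: the exact values above come from the classical uniform-hashing analysis (see, e.g., Knuth, TAOCP volume 3). Here's a tiny self-contained sketch - not the attached framework, and the helper names here are just for illustration - that reproduces both pairs of numbers:

    from math import log

    def approx_found(a):
        # Asymptotic upper bound, successful search: ln(1/(1-a))/a.
        return log(1.0 / (1.0 - a)) / a

    def approx_notfound(a):
        # Asymptotic upper bound, unsuccessful search: 1/(1-a).
        return 1.0 / (1.0 - a)

    def H(k):
        # k'th harmonic number: 1 + 1/2 + ... + 1/k.
        return sum(1.0 / i for i in range(1, k + 1))

    def exact_found(m, n):
        # Exact expectation, successful search, n items in m slots:
        # ((m+1)/n) * (H(m+1) - H(m-n+1)).
        return (m + 1) / n * (H(m + 1) - H(m - n + 1))

    def exact_notfound(m, n):
        # Exact expectation, unsuccessful search: (m+1)/(m-n+1).
        return (m + 1) / (m - n + 1)

    m, n = 8, 5
    a = n / m
    print(approx_found(a), approx_notfound(a))      # 1.57 and 2.67
    print(exact_found(m, n), exact_notfound(m, n))  # 1.34 and 2.25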
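
For readers without the attachment, generator "current" is essentially CPython's probe sequence from dictobject.c recast as a Python generator. A sketch of its presumable shape (assuming a nonnegative hash code h):

    PERTURB_SHIFT = 5

    def current(h, nbits):
        mask = (1 << nbits) - 1
        i = h & mask    # requirement #1: the fastest possible first probe
        yield i
        perturb = h
        while True:
            # Mix in 5 more bits of the hash on each collision, so the
            # sequence eventually depends on every bit (requirement #2).
            perturb >>= PERTURB_SHIFT
            i = (5 * i + perturb + 1) & mask
            yield i

Once perturb reaches 0, this degenerates to i = (5*i + 1) & mask, a full-period recurrence mod 2**nbits, so every slot is eventually visited.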
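
And here's the shape of "double" - treat this as a reconstruction from the description above rather than a copy of the attached code. It pays for one real modulus up front; after that, each probe costs just 1 addition and 1 mask. Because the stride is forced odd, it's coprime to the power-of-2 table size, so the sequence visits every slot:

    def double(h, nbits):
        mask = (1 << nbits) - 1
        i = h & mask
        yield i
        # The expensive startup: a genuine modulus, folding all the
        # hash bits into the stride; "| 1" forces the stride odd.
        inc = (h % mask) | 1
        while True:
            i = (i + inc) & mask    # loop guts: 1 addition and 1 mask
            yield i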
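
Finally, "uniform" hashing just means each key probes the slots in its own (pseudo)random permutation order - the theoretical ideal the other schemes are measured against, and the reason it's painfully slow. One way to fake it (whether the attachment does exactly this is a guess):

    from random import Random

    def uniform(h, nbits):
        # Use the hash code to seed a PRNG, then probe the slots in a
        # random permutation order determined entirely by the hash.
        slots = list(range(1 << nbits))
        Random(h).shuffle(slots)
        yield from slots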