[Python-Dev] Counting collisions for the win

Frank Sievertsen pydev at sievertsen.de
Mon Jan 23 19:58:25 CET 2012



On 23.01.2012 19:25, Glenn Linderman wrote:
> So this sounds like SafeDict, but putting it under the covers and 
> automatically converting from dict to SafeDict after a collision 
> threshold has been reached.  Let's call it fallback-dict.
>
> and costs:
>
> * converting the dict from one hash to the other by rehashing all the 
> keys.

That's not exactly what it does, it calls the randomized hash-function 
only for those
keys, that that didn't find a free slot after 20 collision. And it uses 
this value only for
the further collision resolution.

So the value of hash() is used for the first 20 slots, randomized_hash() 
is used
after that.

1st try: slot[i = perturb = hash];
2nd try: slot[i=i * 5 + 1 + (perturb >>= 5)]
3rd try: slot[i=i * 5 + 1 + (perturb >>= 5)]
....
20th try: slot[i= i * 5 + 1 + (perturb >>= 5)]
21th try: slot[i= perturb = randomized_hash(key)] <---- HERE
22th try: slot[i= i * 5 + 1 + (perturb >>= 5)]

This is also why there is no conversion needed. It's a
per-key/per-lookup rule.

Frank
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-dev/attachments/20120123/d38884d5/attachment.html>


More information about the Python-Dev mailing list