[Python-Dev] The Dictionary Gem is polished!
Christian Tismer
tismer@tismer.com
Tue, 19 Dec 2000 17:48:58 +0200
Tim Peters wrote:
>
> Something else to ponder: my tests show that the current ("old") algorithm
> performs much better (somewhat worse than "new2" == new algorithm + warmup)
> if incr is simply initialized like so instead:
>
> if mp.oldalg:
> incr = (_hash & 0xffffffffL) % (mp.ma_size - 1)
Sure. I did this as well, but didn't consider a division
since it said to be too slow. But this is very platform
dependant. On Pentiums this might be not noticeable.
> That's another way to get all the bits to contribute to the result. Note
> that a mod by size-1 is analogous to "casting out nines" in decimal: it's
> the same as breaking hash into fixed-sized pieces from the right (10 bits
> each if size=2**10, etc), adding the pieces together, and repeating that
> process until only one piece remains. IOW, it's a degenerate form of
> division, but works well all the same. It didn't improve over that when I
> tried a mod by the largest prime less than the table size (which suggests
> we're sucking all we can out of the *probe* sequence given a sometimes-poor
> starting index).
Again I tried this too. Instead of the largest near prime I used
the nearest prime. Remarkably the nearest prime is identical
to the primitive element in a lot of cases.
But no improvement over the modulus.
>
> However, it's subject to the same weak clustering phenomenon as the old
> method due to the ill-advised "~hash" operation in computing the initial
> index. If ~ is also thrown away, it's as good as new2 (here I've tossed out
> the "windows names", and "old" == existing algorithm except (a) get rid of ~
> when computing index and (b) do mod by size-1 when computing incr):
...
> The new and new2 values differ in minor ways from the ones you posted
> because I got rid of the ~ (the ~ has a bad interaction with "additive"
> means of computing incr, because the ~ tends to move the index in the
> opposite direction, and these moves in opposite directions tend to cancel
> out when computing incr+index the first time).
Remarkable.
> too-bad-mod-is-expensive!-ly y'rs - tim
Yes. The wheel is cheapest yet.
ciao - chris
--
Christian Tismer :^) <mailto:tismer@tismer.com>
Mission Impossible 5oftware : Have a break! Take a ride on Python's
Kaunstr. 26 : *Starship* http://starship.python.net
14163 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
where do you want to jump today? http://www.stackless.com