
"M.-A. Lemburg" wrote:
Here are some results; the dictionaries have 1000 entries:
timing for strings:             old =   5.097   new =  5.088
timing for bad integers (<<10): old = 101.540   new = 12.610
timing for bad integers (<<16): old = 571.210   new = 19.220
Even though I believe concentrating on string keys would provide a bigger performance boost for Python in general, you do have a point there. +1 from here.
BTW, would changing the hash function on strings from the simple XOR scheme to something a little smarter help improve performance too (e.g. most strings used in programming never use the 8th bit)?
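To make the question concrete, here is a hypothetical sketch (not CPython's actual code): a hash that only XORs bytes together can never set any bit above the 7th for plain ASCII input, whereas multiplying by an odd constant before each XOR spreads every input bit across the whole word. Only the 1000003 multiplier below is borrowed from CPython's string hash; the function names and test keys are invented for illustration.

    #include <stdio.h>

    /* Plain XOR folding: for ASCII input every byte is < 128, so no
     * bit above the 7th can ever become set in the hash. */
    static unsigned long
    xor_hash(const char *s)
    {
        unsigned long h = 0;
        while (*s)
            h ^= (unsigned char)*s++;
        return h;            /* always in 0..127 for ASCII strings */
    }

    /* Multiplicative mixing: the odd constant smears every input bit
     * over the whole word.  1000003 is CPython's multiplier; the rest
     * of the function is illustrative only. */
    static unsigned long
    mul_hash(const char *s)
    {
        unsigned long h = (unsigned char)*s << 7;
        while (*s)
            h = (1000003UL * h) ^ (unsigned char)*s++;
        return h;
    }

    int main(void)
    {
        const char *keys[] = { "foo", "bar", "baz", "spam", "eggs" };
        for (int i = 0; i < 5; i++)
            printf("%-5s xor=%08lx mul=%08lx\n",
                   keys[i], xor_hash(keys[i]), mul_hash(keys[i]));
        return 0;
    }

Running this shows every xor_hash value landing in 0..127, i.e. at most 7 useful bits for the dict's mask, while the mul_hash values spread over the whole word.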
Yes, it would. I spent the rest of last night doing more accurate tests, refined the implementation (using longs for the shifts etc.), and switched from timing to trip counting, i.e. the dict counts every round through the re-hash. That showed two things:

- The bits used from the string hash are not well distributed.
- Using a "warmup wheel" on the hash to suck all the bits in gives hashes of the same quality as random numbers.

I will publish some results later today.
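A minimal sketch of the trip-counting idea, under stated assumptions: the table below is a toy open-addressing dict, and its probe rule is an invented perturbation scheme, not dictobject.c's actual re-hash. The only point is the trips counter, which turns hash quality into a single comparable number.

    #include <stdio.h>

    /* Toy open-addressing table with a "trips" counter: every extra
     * probe taken on a collision is one trip.  The probe rule is an
     * invented perturbation, not dictobject.c's re-hash. */
    #define SIZE 1024            /* power of two, as in dictobject */
    #define MASK (SIZE - 1)

    static unsigned char slots[SIZE];
    static unsigned long trips;  /* total re-hash rounds, all inserts */

    static void
    insert(unsigned long hash)
    {
        unsigned long perturb = hash;
        unsigned long i = hash & MASK;
        while (slots[i]) {                     /* collision: one trip */
            trips++;
            i = (i * 5 + perturb + 1) & MASK;  /* toy perturbed probe */
            perturb >>= 5;
        }
        slots[i] = 1;    /* mark used; a real dict stores key+value */
    }

    int main(void)
    {
        /* "bad integers": all hash bits below bit 10 are zero, so
         * every key initially maps to slot 0 under the mask. */
        for (unsigned long k = 0; k < 700; k++)
            insert(k << 10);
        printf("trips for 700 inserts: %lu\n", trips);
        return 0;
    }

With well-mixed hashes the same 700 inserts would take far fewer trips; the shifted keys above all funnel into slot 0 first, so the counter directly exposes how many hash bits the mask actually sees.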
I also think that we could inline the string compare function in dictobject.c's lookdict_string to achieve even better performance. Currently it calls a separate function, which prevents the compiler from inlining it.
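A sketch of what such an inlined compare could look like (the Str struct and str_eq below are simplifications invented here, not PyStringObject or the actual lookdict_string code): test the cheap cached fields first and fall back to a byte scan only when everything else matches.

    #include <stdio.h>
    #include <string.h>

    /* Simplified stand-in for a string object; NOT PyStringObject. */
    typedef struct {
        long        hash;   /* cached hash, as string objects keep one */
        size_t      len;
        const char *buf;
    } Str;

    /* Cheap tests first, byte scan last.  Being static and small,
     * this is a prime candidate for compiler inlining, unlike a call
     * through a generic comparison function. */
    static int
    str_eq(const Str *a, const Str *b)
    {
        if (a == b)
            return 1;                  /* same object: frequent hit */
        if (a->hash != b->hash || a->len != b->len)
            return 0;                  /* reject without touching buf */
        return memcmp(a->buf, b->buf, a->len) == 0;
    }

    int main(void)
    {
        Str a = { 0x1234, 3, "foo" };
        Str b = { 0x1234, 3, "foo" };
        Str c = { 0x9999, 3, "bar" };
        printf("%d %d\n", str_eq(&a, &b), str_eq(&a, &c));
        return 0;
    }

Because str_eq is small and statically visible, the compiler can fold it into the lookup loop, so the common mismatch cases cost only a couple of word compares.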
Sure!

ciao - chris

--
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9  9D15 D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com