[Python-Dev] The Dictionary Gem is polished!

Christian Tismer tismer@tismer.com
Mon, 18 Dec 2000 13:08:34 +0200


"M.-A. Lemburg" wrote:
> 
> > Here some results, dictionaries have 1000 entries:
> >
> > timing for strings              old=  5.097 new= 5.088
> > timing for bad integers (<<10)  old=101.540 new=12.610
> > timing for bad integers (<<16)  old=571.210 new=19.220
> 
> Even though I think concentrating on string keys would provide more
> performance boost for Python in general, I think you have a point
> there. +1 from here.
> 
> BTW, would changing the hash function on strings from the simple
> XOR scheme to something a little smarter help improve the performance
> too (e.g. most strings used in programming never use the 8-th
> bit) ?

Yes, it would. I spent the rest of last night to do more
accurate tests, also refined the implementation (using
longs for the shifts etc), and turned from timing over to
trip counting, i.e. a dict counts every round through the
re-hash. That showed two things:
- The bits used from the string hash are not well distributed
- using a "warmup wheel" on the hash to suck all bits in
  gives the same quality of hashes like random numbers.

I will publish some results later today.

> I also think that we could inline the string compare function
> in dictobject:lookdict_string to achieve even better performance.
> Currently it uses a function which doesn't trigger compiler
> inlining.

Sure!

ciao - chris

-- 
Christian Tismer             :^)   <mailto:tismer@tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Kaunstr. 26                  :    *Starship* http://starship.python.net
14163 Berlin                 :     PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint       E182 71C7 1A9D 66E9 9D15  D3CC D4D7 93E2 1FAE F6DF
     where do you want to jump today?   http://www.stackless.com