[Python-Dev] Dictionary sparseness

Tim Peters tim.one@comcast.net
Sun, 04 May 2003 23:00:09 -0400


[Raymond Hettinger]
> ...
> P.S.  Also, I think it worthwhile to at least transform dictresize()
> into PyDict_Resize() so that C extensions will have some control.
> This would make it possible for us to add a single line making
> the builtin dictionary more sparse and providing a 75% first probe
> hit rate.

The dynamic hit rate is the one that counts, and, e.g., it's not going to
speed anything to remove the current lowest-8-but-not-lowest-9-bits
collision between 'ArithmeticError' and 'reload' (I've never seen the former
used, and the latter is expensive).  IOW, measuring the dynamic first-probe
hit rate is a prerequisite to selling this idea; a stronger prerequisite is
demonstrating actual before-and-after speedups.

I agree with Guido that giving people controls they're ill-equipped to
understand will do more harm than good.  Even when they manage to stumble
into a small speedup, that will often become counterproductive over time, as
the characteristics of their ever-growing app change, and the Speed Weenie
who got the 2% speedup left, or moved on to some other project.  Or somebody
corrects the option name from 'smalest' to 'smallest', and suddenly the only
dict entry that mattered doesn't collide anymore -- but the mystery knob
boosting the dict size "because it sped things up" forever more wastes half
the space for a reason nobody ever understood.  Or we change Python's string
hash to use addition instead of xor to merge in the next character (a change
that may actually help a bit -- addition is a littler better at scrambling
the bits).  Etc.

it's-python-it's-supposed-to-be-slow<wink>-ly y'rs  - tim