[Python-Dev] Dictionary tuning

Raymond Hettinger python@rcn.com
Mon, 28 Apr 2003 19:50:14 -0400

[jack, master of the mac]
> You wouldn't have created some handy tables of 'typical' dictionary
> usage, would you?  They would be interesting in general, but very nice
> for the PEPs doing dict optimizations for symbol tables in particular.

That path proved fruitless.  I studied the usage patterns in about
a dozen of my apps and found that there is no such thing as typical.
Instead there are many categories of dictionary usage.

* attribute/method look-up in many small dictionaries
* uniquification apps with many redundant stores and few lookups
* membership testing with few stores and many lookups into small or large dicts
* database-style lookups following Zipf's law for key access in large dicts
* graph explorers that access a few keys frequently and then move
   on to another set of related nodes
* global/builtin variable access following a failed search of locals
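A couple of these categories can be sketched in a few lines of Python
(the function names here are hypothetical, purely for illustration):

```python
# Uniquification: many redundant stores, few lookups.
# Repeated items overwrite the same slot, so the dict ends up
# with one entry per distinct item.
def unique(items):
    seen = {}
    for item in items:
        seen[item] = True      # redundant store for every repeat
    return list(seen)

# Membership testing: a dict built once, then hit with many lookups.
STOP_WORDS = {"the": 1, "a": 1, "an": 1}

def is_stop_word(word):
    return word in STOP_WORDS  # lookup-dominated access pattern
```

Note the two sketches stress the dictionary in opposite ways: the first
is write-heavy, the second read-heavy, which is exactly why one tuning
rarely helps both.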

Almost every dictionary tune-up that helped one app would end up
hurting another.  The only way to test the effectiveness of a change
was to time a whole suite of applications.  The standard benchmarks
were useless in this regard.  Worse still, contrived test programs 
would not predict the results for real apps.  There were several
reasons for this:

* there is a special case for handling dicts that only have string keys
* real apps exhibit key access patterns that pull the most frequently
   accessed entries into the cache.  This thwarted attempts to improve
   cache performance at the expense of more collisions.
* any small, fixed set of test keys may have atypical collision anomalies,
   non-representative access frequencies, or not be characteristic of other
   dicts with a slightly different number of keys.
* some sets of keys have non-random hash patterns, but relying on
   this victimizes other sets of keys.
* the results are platform dependent (ratio of processor speed to memory
   speed; size of cache; size of a cache line; cache associativity; write-back
   vs. write-through; etc.).
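The non-random hash pattern point is easy to see directly, since Python
exposes hash().  The probe below (a hypothetical helper, not from the
original experiments) counts slot collisions for a key set in a table of
a given size:

```python
def collision_count(keys, table_size):
    # Count keys whose hash maps to an already-occupied slot
    # (ignoring the open-addressing probe sequence).
    occupied = set()
    collisions = 0
    for key in keys:
        slot = hash(key) % table_size
        if slot in occupied:
            collisions += 1
        else:
            occupied.add(slot)
    return collisions

# In CPython, small ints hash to themselves, so consecutive integers
# fill a table perfectly with zero collisions -- an atypically
# favorable pattern that other key sets will not reproduce.
print(collision_count(range(8), 8))   # 0
```

A benchmark built on such a key set would report collision behavior far
rosier than most real workloads would see.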

I had done some experiments that focused on symbol tables and
had some luck with sequential searches into a self-organizing list.
Using a list eliminated the holes and allowed more of the entries
to fit in a single cache line.  No placeholders were needed for
deleted entries and that saves a test in the search loop.  The
self-organizing property kept the most frequently accessed
entries at the head of the list.   Using a sequential search had
slightly less overhead than the hash table search pattern.  Except
for the builtin dictionary, most of the symbol tables in my apps
have only a handful of entries.
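A minimal sketch of that idea, assuming a move-to-front policy for the
self-organizing step (the class name and details are mine, not the
actual experimental code):

```python
class SelfOrganizingTable:
    """Symbol table backed by a flat list: sequential search,
    move-to-front on successful lookup."""

    def __init__(self):
        # Contiguous (key, value) pairs: no holes, and no placeholder
        # entries for deletions, so the search loop needs no extra test.
        self._entries = []

    def __setitem__(self, key, value):
        for i, (k, _) in enumerate(self._entries):
            if k == key:
                self._entries[i] = (key, value)
                return
        self._entries.append((key, value))

    def __getitem__(self, key):
        for i, (k, v) in enumerate(self._entries):
            if k == key:
                if i:  # self-organizing: hot entries migrate to the head
                    del self._entries[i]
                    self._entries.insert(0, (k, v))
                return v
        raise KeyError(key)

    def __delitem__(self, key):
        for i, (k, _) in enumerate(self._entries):
            if k == key:
                del self._entries[i]   # entries stay contiguous
                return
        raise KeyError(key)
```

With only a handful of entries, the whole list can sit in one or two
cache lines, and the frequently accessed names cluster at the front
where the sequential search finds them first.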

if-only-i-had-had-a-single-valid-dict-performance-predictor-ly yours,

Raymond Hettinger