[Python-Dev] Dictionary tuning

Raymond Hettinger python@rcn.com
Mon, 28 Apr 2003 20:20:35 -0400


> What is the effect on peak memory usage over "average" programs?

Since the amortized growth is the same, the effect is nil on average.
Special cases can be contrived with a specific number of records
where the memory use is doubled, but for typical programs it is
nearly unchanged.
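
A quick way to see the step pattern (a minimal sketch on a modern
CPython; exact sizes vary by version) is to watch sys.getsizeof
as a dict fills:

    import sys

    # Print the dict's own footprint each time it changes; each
    # jump marks a resize of the internal table (item objects are
    # not counted).
    d = {}
    last = sys.getsizeof(d)
    for i in range(100):
        d[i] = None
        size = sys.getsizeof(d)
        if size != last:
            print(f"{i + 1:3d} entries -> {size} bytes")
            last = size

A program that happens to stop inserting right after one of those
jumps sits on roughly double the memory it needs; averaged across
sizes, the overhead comes out the same.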

> This might be a worthwhile speedup on small dicts (up to a TBD 
> number of entries) but not worthwhile for large dicts.

Actually, it helps large dictionaries even more than small dictionaries.
Collisions in large dicts are resolved through further memory probes
which are almost certain not to be in the current cache line.
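
For context, here is a simplified sketch of the probe order,
modeled on the perturb scheme described in CPython's dictobject.c
(the real code is C and the details differ slightly across versions):

    import itertools

    def probe_sequence(hash_value, mask):
        # Yield slot indices in roughly the order the dict probes
        # them when resolving a collision.
        perturb = hash_value
        i = hash_value & mask
        while True:
            yield i
            perturb >>= 5  # PERTURB_SHIFT
            i = (i * 5 + perturb + 1) & mask

    # First few probes for hash 42 in an 8-slot table (mask 7):
    print(list(itertools.islice(probe_sequence(42, 7), 5)))

After the first slot, the sequence hops around the table, so in a
large dict each extra probe is likely a cache miss; reducing
collisions therefore pays off more there.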

> However, to add this capability in would of course add more code
> to a very common code path (additional test for current size to 
> decide what factor to increase by).

Your intuition is exactly correct.  All experiments with special-casing
various sizes resulted in decreased performance because they added
a tiny amount of work to some of the most heavily exercised code in
Python.  Further, it introduces an unpredictable branch, which is
also not a good thing.
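
A sketch of what such a special case would look like (the threshold
and the second factor here are illustrative, not from the patch):

    def grow_uniform(used):
        # One multiply on the resize path, no data-dependent branch.
        return used * 4

    def grow_special_cased(used):
        # Hypothetical: the extra size test adds work to a very hot
        # path, and because dict sizes vary wildly the branch is
        # hard for the CPU to predict.
        if used < 50000:
            return used * 4
        return used * 2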

[GvR]
> I tried the patch with my new favorite benchmark, startup time for
> Zope (which surely populates a lot of dicts :-).  It did give about
> 0.13 seconds speedup on a total around 3.5 seconds, or almost 4%
> speedup.

[Tim]
>Nice (in relative, not absolute terms). Do we have any numbers on 
> memory usage during and after that period?

I found out that timing dict performance was hard.
Capturing memory usage was harder.  Of the candidate measures
(entry space, entry space plus unused slots, calls to pymalloc,
and calls to the OS malloc), only the last is important, but
it depends on all kinds of things that are not easily
controlled.
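
Today, one way to capture at least the Python-level side is
tracemalloc (which postdates this thread, so this is a modern
sketch; it still cannot see what the OS-level malloc does
underneath pymalloc):

    import tracemalloc

    tracemalloc.start()
    d = {i: None for i in range(100_000)}
    current, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    print(f"current={current} bytes, peak={peak} bytes")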


