RE: [Python-Dev] Dictionary tuning

That's what I was getting at. I know that (for example) most classes I create have fewer than 16 entries in their __dict__. With this change, each class instance would take (approx) twice as much memory for its __dict__. I suspect that class instance __dict__ is the most common dictionary I use.
That's not what I meant. Most dictionaries are fairly small. Large dictionaries are common, but I doubt they are common enough to offset the potential memory loss from this patch. Currently, if you go one over a threshold you have a capacity of 2*len(d)-1. With the patch this would change to 4*len(d)-1 - very significant for large dictionaries. Thus my suggestion that it might be worthwhile for smaller dictionaries (depending on memory characteristics) but not for large ones. Perhaps we need to add some internal profiling, so that "quickly-growing" dictionaries get larger reallocations ;)
I didn't look at the surrounding code (bad Tim D - thwack!) but in this case I would not expect an appreciable performance loss from this. However, the fact that we're getting an appreciable performance *gain* from changes on this branch suggests that it might be slightly more vulnerable than expected (but should still be swamped by the resize).
I find that considerably easier to read in any case ;)

Cheers.

Tim Delaney

[Raymond Hettinger]
I think of the resize intervals as steps on a staircase. My patch eliminates the even-numbered stairs. The average logarithmic slope of the staircase doesn't change; there are just fewer discrete steps. Also, the height of the staircase doesn't change unless the top stair was even, in which case another half step is added.

Me either. I suspect it is rare to find a stable application that is functioning just fine while consuming nearly all memory. Sooner or later, some change in data, hardware, OS, or script would push it over the edge.

Those dicts would also be the ones benefiting from the patch. Their density would be halved, resulting in fewer collisions, improved search times, and better cache performance.

[Delaney, Timothy C]
Perhaps we need to add some internal profiling, so that "quickly-growing" dictionaries get larger reallocations ;)

[Raymond Hettinger]
I came up with this patch a couple of months ago and have since tried every tweak I could think of (apply it to dicts of this size but not that, etc.) but found nothing that survived a battery of application benchmarks. Have you guys tried out the patch? I'm very interested in getting results from different benchmarks, processors, cache sizes, and operating systems.

sparse-is-better-than-dense-ly yours,

Raymond (currently the only one, unlike two Tims, two Bretts, two Jacks, and a Fredrik distinct from Fred)
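The staircase image can be sketched numerically. The function below simulates the resize rule discussed in this thread (grow when the table is two-thirds full, to the smallest power of two exceeding used * growth); the exact constants are assumptions for illustration, not the shipped dictobject.c code.

```python
def resize_staircase(max_used, growth):
    """Table sizes a dict passes through while growing to max_used entries.

    Sketch of the resize rule discussed in this thread: grow when
    fill * 3 >= slots * 2, to the smallest power of two greater than
    used * growth.  The constants are assumptions for illustration.
    """
    slots = 8          # assumed minimal table size
    caps = []
    for used in range(1, max_used + 1):
        if used * 3 >= slots * 2:      # two-thirds full: grow the table
            new = slots
            while new <= used * growth:
                new <<= 1
            slots = new
            caps.append(slots)
    return caps

# Doubling visits every power of two; quadrupling skips every other stair.
print(resize_staircase(100, 2))   # [16, 32, 64, 128, 256]
print(resize_staircase(100, 4))   # [32, 128, 512]
```

Under this model the quadrupling patch does exactly what the staircase analogy says: the same logarithmic slope, with the alternating stairs removed.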

[Tim Peters]
Do they have fewer than 6 entries? Dicts with 5 or fewer entries don't change size at all (an "empty dict" comes with room for 5 entries). Surprise <wink>: in many apps, the most frequent use is dicts created to hold keyword arguments at call sites. This happens under the covers, so you're not normally aware of it. Those almost always hold fewer than 6 entries - except in apps where they don't. But they're usually short-lived too (not surviving the function call they're created for).
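The "room for 5 entries" figure falls out of the two-thirds fill threshold. A quick check, assuming the 8-slot minimum table and the fill * 3 >= slots * 2 resize test mentioned in this thread:

```python
# Assumed resize test: grow when fill * 3 >= slots * 2.  With the
# minimal 8-slot table, the largest fill that does NOT trigger a
# resize is 5 - hence an "empty dict" has room for 5 entries.
slots = 8
max_entries = max(n for n in range(slots) if n * 3 < slots * 2)
print(max_entries)  # 5
```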
Two-thirds of which is empty space right after resizing, BTW.
[Delaney, Timothy C]
With the patch this would change to 4*len(d)-1 - very significant for large dictionaries.

[Tim Peters]
I don't know that it is. One dict slot consumes 12 bytes on 32-bit boxes, and slots are allocated contiguously so there's no hidden malloc overhead per slot. I hope a dict with a million slots counts as large, but that's "only" 12MB for slot space. When it gets too large to fit in RAM, that's deadly to performance; I've reached that point many times in experimental code, but those were lazy algorithms to an extreme. So I'm more worried about apps with several large dicts than about apps with one huge dict.
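The 12MB figure is simple arithmetic: each slot holds three 4-byte fields (cached hash, key pointer, value pointer) on a 32-bit box.

```python
SLOT_BYTES = 4 + 4 + 4        # hash, key pointer, value pointer on 32-bit
slots = 2 ** 20               # a dict with a million slots
print(slots * SLOT_BYTES // 2 ** 20, "MB")  # 12 MB
```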
There's always more than one effect from a change. Raymond explained that large dict performance is boosted due to fewer collisions, and that makes perfect sense (every probe in a large dict is likely to be a cache miss). It doesn't make sense that fiddling the code inside the if-block slows anything, unless perhaps it's an unfortunate I-stream cache effect slowing the normal (if-block not entered) case. When you're looking at out-of-cache code, second- and third-order causes are often the whole story.
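The fewer-collisions claim is easy to check with a toy open-addressed table. This sketch uses a CPython-style perturb probe sequence (an assumption for illustration, not the exact dictobject.c loop) and compares average probes per insert at two-thirds versus one-third load:

```python
import random

def avg_probes(nslots, nkeys, seed=12345):
    """Average probes needed to insert nkeys random keys into an
    open-addressed table of nslots slots (nslots a power of two),
    using a CPython-style perturbed probe sequence."""
    rng = random.Random(seed)
    table = [False] * nslots
    mask = nslots - 1
    total = 0
    for _ in range(nkeys):
        h = rng.getrandbits(32)
        i = h & mask
        perturb = h
        probes = 1
        while table[i]:                      # collision: keep probing
            i = (5 * i + perturb + 1) & mask
            perturb >>= 5
            probes += 1
        table[i] = True
        total += probes
    return total / nkeys

# Same 682 keys at two-thirds full vs one-third full:
dense = avg_probes(1024, 682)
sparse = avg_probes(2048, 682)
print(dense > sparse)  # True: halving the density cuts collisions
```

Halving the load factor roughly halves the chance that any probe lands on an occupied slot, which is the mechanism behind the improved search times claimed above.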

participants (3)
- Delaney, Timothy C (Timothy)
- Raymond Hettinger
- Tim Peters