[Python-Dev] py2.7: dictobject not properly resizing
mynameisfiber at gmail.com
Sat Mar 30 22:31:26 CET 2013
I was taking a look at dictobject.c and realized that the logic
controlling whether a resizedict will occur in
dict_set_item_by_hash_or_entry disallows for the shrinking of a
dictionary. This is contrary to what the comments directly above say:
771 /* If we added a key, we can safely resize. Otherwise just return!
772 * If fill >= 2/3 size, adjust size. Normally, this doubles or
773 * quaduples the size, but it's also possible for the dict to shrink
774 * (if ma_fill is much larger than ma_used, meaning a lot of dict
775 * keys have been * deleted).
The "bug" occures in the following conditional since we exit out of
the function without checking the relative magnitudes of ma_filled to
ma_used. Instead, we only check if we still have a correct loading
factor (and the "don't resize on modification" bit).
This can be fixed by changing the following conditional on line 785 to:
if (mp->ma_used <= n_used || (mp->ma_fill*3 < (mp->ma_mask+1)*2 &&
mp->ma_used*5 > mp->ma_fill))
The factor of 5 was chosen arbitrarily... I'm sure with some
benchmarking we could tune it to an optimal value for the internal use
of dictionaries. However, before I put this effort in I was wondering if
this is in fact desired behavior or if it is indeed a bug. At the
very least, the comments should be updated to reflect the actual
resizing dynamics of the dictionary.
More information about the Python-Dev