[Python-Dev] When do sets shrink?

Raymond Hettinger raymond.hettinger at verizon.net
Sat Dec 31 21:52:39 CET 2005


[Noam]
> For example, iteration over a set which once had
> 1,000,000 members and now has 2 can take 1,000,000 operations every
> time you traverse all the (2) elements.

Do you find that to be a common or plausible use case?
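
For the record, the effect is easy to reproduce.  On a current CPython,
something along these lines shows it (exact timings vary by build and
version; this is a quick sanity check, not a benchmark):

    import timeit

    s = set(range(1000000))
    while len(s) > 2:
        s.pop()              # the table keeps its million-entry footprint
    fresh = set(s)           # same two elements, right-sized table

    # Iterating the shrunken set still scans the whole table;
    # the rebuilt one touches only a handful of slots.
    print(timeit.timeit(lambda: list(s), number=100))
    print(timeit.timeit(lambda: list(fresh), number=100))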

Was Guido's suggestion of s = set(s) unworkable for some reason?  Dicts
and sets emphasize fast lookups over fast iteration -- apps requiring
many iterations over a collection may be better off converting to a list
(which has no dummy entries or empty gaps between entries).
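
Spelled out, both workarounds are one-liners:

    s = set(s)        # rebuild: a fresh, right-sized table for the
                      # surviving elements

    items = list(s)   # or convert once for iteration-heavy work; a list
                      # stores its items contiguously, nothing to skip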

Would the case be improved by incurring the time cost of 999,998 tests
for possible resizing (one for each pop) and some non-trivial number of
resize operations along the way (each requiring a full iteration over
the then-current table size)?
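
For scale, here is a pure-Python sketch of what a shrink-on-pop policy
would have to do.  Everything in it is hypothetical (the names, the 1/8
threshold, the halving rule) -- it is not how CPython's sets behave; it
only prices out the tests and the resize passes for the
million-down-to-two scenario:

    MIN_SIZE = 8

    class ShrinkingSet:
        # Hypothetical table that halves its capacity whenever the
        # fill ratio drops below 1/8.
        def __init__(self, items):
            self.live = list(items)
            self.capacity = max(MIN_SIZE, 4 * len(self.live))
            self.checks = self.resizes = 0

        def pop(self):
            item = self.live.pop()
            self.checks += 1                     # one test per pop
            if self.capacity > MIN_SIZE and 8 * len(self.live) < self.capacity:
                self.capacity //= 2              # a real table would rehash
                self.resizes += 1                # every live entry here
            return item

    s = ShrinkingSet(range(1000000))
    while len(s.live) > 2:
        s.pop()
    print(s.checks, "per-pop tests;", s.resizes, "resize passes")

The per-pop test is cheap but ubiquitous; each resize pass is the
expensive part, costing a full traversal of the then-current table.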

Even if this atypical case could be improved, what would the impact be
on common cases?  Would a downsizing scheme risk thrashing against the
over-allocation scheme in workloads that mix adds and pops?
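
The thrashing risk is easy to see with a deliberately naive policy whose
grow and shrink thresholds sit too close together (again, hypothetical
numbers chosen only to illustrate the failure mode):

    def simulate(ops):
        # Grow when over 2/3 full; shrink when under 1/2 full.  After a
        # doubling, the table is immediately "under-full" again, so a
        # workload hovering near the boundary resizes on almost every
        # operation.
        size, capacity, resizes = 0, 8, 0
        for op in ops:
            size += 1 if op == "add" else -1
            if 3 * size > 2 * capacity:                  # too full: double
                capacity *= 2
                resizes += 1
            elif capacity > 8 and 2 * size < capacity:   # too empty: halve
                capacity //= 2
                resizes += 1
        return resizes

    ops = ["add"] * 6 + ["add", "pop"] * 10000
    print(simulate(ops), "resizes in", len(ops), "operations")

Any workable scheme would need widely separated thresholds to get
hysteresis, and even then the net benefit for common workloads is the
open question.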

Is there any new information or research beyond what has been obvious
from the moment the dict resizing scheme was born?



Raymond


