[Python-Dev] When do sets shrink?

Noam Raphael noamraph at gmail.com
Thu Dec 29 18:03:08 CET 2005


On 12/29/05, Fredrik Lundh <fredrik at pythonware.com> wrote:
> Noam Raphael wrote:
>
> > I'm not saying that practically it must be used - I'm just saying that
> > it can't be called a heuristic, and that it doesn't involve any "fancy
> > overkill size hinting or history tracking". It actually means
> > something like this:
> > 1. If you want to insert and the table is full, resize the table to
> > twice the current size.
> > 2. If you delete and the number of elements turns out to be less than
> > a quarter of the size of the table, resize the table to half of the
> > current size.
>
> sure sounds like a heuristic algorithm to me... (as in "not guaranteed to
> be optimal under all circumstances, even if it's probably quite good in all
> practical cases")

I'm not saying it's optimal, but it is really amortized O(1) per
insert/delete. I looked up in "Introduction to Algorithms" for this,
and it has a complicated explanation. A simple explanation is that
after every resize the table is exactly half-full. Let's say it has n
elements and the table size is 2*n. To get to the next resize, you
have to do at least n/2 removals of elements, or n insertion of
elements. After that, you do a resize operation. In either case, you
do an O(n) resize operation after at least O(n) insertions/removals
which are O(1) operations. This means that the toal cost remains O(n)
per n simple operations, which you can say is O(1) per simple
operation.

I hope that if you read this slowly it makes sense...

Noam


More information about the Python-Dev mailing list