On Mon, Nov 9, 2009 at 2:42 AM, "Martin v. Löwis" <martin@v.loewis.de> wrote:

Interesting. Something goes wrong, it seems: if items get removed over

and over again, I think the set should shrink (not sure whether it

actually does). Then, I think you should end up with an amortized O(1)

for selecting an element (assuming that the underlying hashes don't

collide).

I'm not sure if Python's implementation shrinks or not, but even if it did shrink, it would still be amortized O(n).

Imagine a completely full hash table that currently contains n elements in n slots (I know this cannot happen in Python's implementation but it's useful for illustrative purposes). Assume it will shrink when it gets down to n/2 elements.

Here is my pathological algorithm again, for reference purposes:

while s:

for i in s:

break

# Imagine a complex algorithm here that depends on i still being in s

s.remove(i)

We repeatedly search through the slots sequentially and remove the first element we find. The first removal requires checking 1 slot, the second removal requires checking 2 slots, the third removal requires checking 3 slots, and so forth. The hash table will shrink after the n/2-th removal, when we have checked 1 + 2 + 3 + ... + n/2 = O(n**2) slots for n/2 removals (or amortized O(n) per removal). It's too late for shrinking to save us; we've already performed too much work.

--

Daniel Stutzbach, Ph.D.

President, Stutzbach Enterprises, LLC