[Python-Dev] New methods for weakref.Weak*Dictionary types

Tim Peters tim.peters at gmail.com
Wed May 10 19:08:11 CEST 2006


[Tim Peters]
>> """
>>     # Return a list of weakrefs to all the objects in the collection.
>>     # Because a weak dict is used internally, iteration is dicey (the
>>     # underlying dict may change size during iteration, due to gc or
>>     # activity from other threads).

[Armin Rigo]
> But then, isn't the real problem the fact that applications cannot
> safely iterate over weak dicts?

Well, in the presence of threads, iterating over any dict (weak or
not) may blow up:  while some thread is iterating over the dict, other
threads can change the size of the dict, and then the dict iterator
will raise RuntimeError ("dictionary changed size during iteration")
the next time it's invoked.  In the context from which
that comment was extracted, that's a potential problem (which is why
the comment mentions "activity from other threads" :-)).
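
To make that failure concrete, here is a minimal sketch.  It is
single-threaded on purpose so that it is deterministic: the insertion
between the two next() calls stands in for what another thread (or, for
a weak dict, gc) might do at an unlucky moment.  The function name is
made up for illustration.

```python
def iterate_while_resizing():
    """Return the error message raised when a dict changes size mid-iteration."""
    d = {"a": 1, "b": 2, "c": 3}
    it = iter(d)
    next(it)            # the iterator is now partway through the dict

    # Stand-in for "another thread changed the size of the dict".
    d["d"] = 4

    try:
        next(it)        # the iterator notices the size change here
    except RuntimeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(iterate_while_resizing())
```

Note that the error shows up on the next use of the iterator, not at the
moment the dict is modified.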

> This fact could be viewed as a bug, and fixed without API changes.  For
> example, I can imagine returning to the client an iterator that "locks"
> the dictionary.  Upon exhaustion, or via the __del__ of the iterator, or
> even in the 'finally:' part of the generator if that's how iteration is
> implemented, the dict is unlocked.
>
> Here "locking" means that weakrefs going away during this time are not
> eagerly removed from the dict; they will be removed only when the dict
> is unlocked.

That could remove one source of potential iteration surprises unique
to weak dicts, due to "magical" removal of dict entries (note that it
would probably need a thread-safe count of outstanding iterators, and
not "unlock the dict" until the count fell to 0).  Other threads could
still change the dict's size _seemingly_ "by magic" (from the dict
iterator's POV).  I don't know whether fixing "magical" weak-dict
removal without fixing "seemingly magical" weak-dict removal or
addition via other threads would be worth the bother.  Anyone burned
by either now has learned to avoid the iter{keys,values,items}()
methods.
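
A rough sketch of the "locking" idea, with the count of outstanding
iterators mentioned above.  This is not how weakref.WeakValueDictionary
is written; the class name, its attributes, and the itervalues() method
are all hypothetical, and the sketch assumes CPython-style refcounting so
that a weakref callback runs as soon as the last strong reference goes
away.  It only defers the "magical" removals; another thread adding or
deleting keys during iteration would still break the iterator.

```python
import threading
import weakref


class DeferredRemovalWeakValueDict:
    """Hypothetical weak-value mapping that defers removals while iterated."""

    def __init__(self):
        self._data = {}                  # key -> weakref to the value
        self._lock = threading.RLock()   # re-entrant: callbacks may nest
        self._iterating = 0              # count of outstanding iterators
        self._pending = []               # (key, ref) removals put off

    def __len__(self):
        return len(self._data)

    def __setitem__(self, key, value):
        def on_death(ref, key=key):
            with self._lock:
                if self._iterating:
                    # "Locked": remember the removal instead of doing it.
                    self._pending.append((key, ref))
                elif self._data.get(key) is ref:
                    del self._data[key]

        with self._lock:
            self._data[key] = weakref.ref(value, on_death)

    def _commit_removals(self):
        # Caller holds self._lock and has checked that no iterator is active.
        while self._pending:
            key, ref = self._pending.pop()
            # Skip keys that were rebound to a new value in the meantime.
            if self._data.get(key) is ref:
                del self._data[key]

    def itervalues(self):
        with self._lock:
            self._iterating += 1
        try:
            for ref in self._data.values():
                obj = ref()
                if obj is not None:      # skip entries whose value has died
                    yield obj
        finally:
            # Runs on exhaustion, on close(), or when the generator is
            # garbage collected; unlock only when the count falls to 0.
            with self._lock:
                self._iterating -= 1
                if self._iterating == 0:
                    self._commit_removals()
```

While an iterator is outstanding, a value that dies leaves its (now dead)
entry in place, so the underlying dict keeps its size; the entry is
dropped once the last iterator finishes.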

Without more support in the dict implementation (and support that
would probably be difficult to add), the only thoroughly safe strategy
is to atomically materialize a hidden collection of the keys, values,
or items to be iterated over, and have the iterator march over those. 
In effect, apps do that themselves now by iterating over
.keys()/values()/items() instead of their .iterXYZ() versions.  Many
apps can get away without that, though, so there's value in keeping
the current "obvious" dict iterators.
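
What apps do now amounts to something like the following sketch.  The
function name and the sample data are made up for illustration, and it is
written so it runs on current Pythons, where list(d) plays the role that
d.keys() plays in the text above.  Whether building the copy is truly
atomic depends on the interpreter; the sketch does not address that.

```python
def drop_even_values(d):
    """Delete every entry with an even value, iterating over a snapshot."""
    # list(d) materializes the keys up front, so the loop walks the copy
    # and the deletions below cannot upset a live dict iterator.
    for key in list(d):
        if d[key] % 2 == 0:
            del d[key]
    return d


if __name__ == "__main__":
    print(drop_even_values({"a": 1, "b": 2, "c": 3, "d": 4}))
```

The cost is a full copy of the keys on every pass, which is why the
cheaper "obvious" iterators are still worth having for apps that never
change the dict mid-loop.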

