[Python-Dev] New methods for weakref.Weak*Dictionary types

Tim Peters tim.peters at gmail.com
Mon May 1 22:57:06 CEST 2006


[Fred L. Drake, Jr.]
> I'd like to commit this for Python 2.5:
>
> http://www.python.org/sf/1479988
>
> The WeakKeyDictionary and WeakValueDictionary don't
> provide any API to get just the weakrefs out, instead
> of the usual mapping API. This can be desirable when
> you want to get a list of everything without creating
> new references to the underlying objects at that moment.
>
> This patch adds methods to make the references
> themselves accessible using the API, avoiding requiring
> client code to have to depend on the implementation.
> The WeakKeyDictionary gains the .iterkeyrefs() and
> .keyrefs() methods, and the WeakValueDictionary gains
> the .itervaluerefs() and .valuerefs() methods.
>
> The patch includes tests and docs.

+1.  A real need for this is explained in the WeakSet class in ZODB's
ZODB/util.py, which contains a WeakValueDictionary:

"""
    # Return a list of weakrefs to all the objects in the collection.
    # Because a weak dict is used internally, iteration is dicey (the
    # underlying dict may change size during iteration, due to gc or
    # activity from other threads).  as_weakref_list() is safe.
    #
    # Something like this should really be a method of Python's weak dicts.
    # If we invoke self.data.values() instead, we get back a list of live
    # objects instead of weakrefs.  If gc occurs while this list is alive,
    # all the objects move to an older generation (because they're strongly
    # referenced by the list!).  They can't get collected then, until a
    # less frequent collection of the older generation.  Before then, if we
    # invoke self.data.values() again, they're still alive, and if gc occurs
    # while that list is alive they're all moved to yet an older generation.
    # And so on.  Stress tests showed that it was easy to get into a state
    # where a WeakSet grows without bounds, despite that almost all its
    # elements are actually trash.  By returning a list of weakrefs instead,
    # we avoid that, although the decision to use weakrefs is now very
    # visible to our clients.
    def as_weakref_list(self):
        # We're cheating by breaking into the internals of Python's
        # WeakValueDictionary here (accessing its .data attribute).
        return self.data.data.values()
"""
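
The effect that comment describes can be sketched in a few lines. This is
only an illustration, not ZODB code: Node is a hypothetical class, and
valuerefs() is the method proposed in the patch (the weakref module in
later Python releases provides it).

```python
import gc
import weakref

class Node(object):
    """Hypothetical element type (instances must be weakly referenceable)."""

d = weakref.WeakValueDictionary()
obj = Node()
d["k"] = obj

strong = list(d.values())      # list of live objects: holds a strong reference
weak = list(d.valuerefs())     # list of weakrefs: holds no strong reference

del obj
gc.collect()
alive_with_strong_list = len(d)   # the values() list still keeps the object alive

del strong
gc.collect()
alive_after_release = len(d)      # nothing strong remains, so the entry is gone
referent = weak[0]()              # the weakref now dereferences to None
```

The list of weakrefs never extends the object's lifetime, which is the
property WeakSet needs. as_weakref_list() in the quoted ZODB code gets the
same kind of list by reaching into .data directly.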

As that implementation suggests, though, I'm not sure there's real
payback for the extra time taken in the patch's `valuerefs`
implementation to weed out weakrefs whose referents are already gone: 
the caller has to make this check anyway when it iterates over the
returned list of weakrefs.  Iterating inside the implementation, to
build the list via itervalues(), also creates that much more
vulnerability to "dict changed size during iteration" multi-threading
surprises.  For that last reason, if the patch went in as-is, I expect
ZODB would still need to "cheat"; obtaining the list of weakrefs
directly via plain .data.values() is atomic, and so immune to these
multi-threading surprises.
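
To illustrate the caller-side check, here is a minimal sketch. Node and
live_referents() are hypothetical names made up for the example, and it
uses valuerefs() as proposed in the patch rather than ZODB's .data
shortcut. The point is only that a caller holding a list of weakrefs has
to test each one for a dead referent, whether or not the method that
produced the list filtered them first.

```python
import gc
import weakref

class Node(object):
    """Hypothetical payload type; instances must be weakly referenceable."""
    def __init__(self, name):
        self.name = name

def live_referents(refs):
    """Dereference each weakref, skipping those whose referent is gone."""
    live = []
    for ref in refs:
        obj = ref()            # None once the referent has been collected
        if obj is not None:
            live.append(obj)
    return live

registry = weakref.WeakValueDictionary()
a, b = Node("a"), Node("b")
registry["a"] = a
registry["b"] = b

refs = list(registry.valuerefs())   # snapshot of weakrefs, no strong refs
del b                               # referent can die after the snapshot
gc.collect()                        # for implementations without refcounting

names = sorted(obj.name for obj in live_referents(refs))
```

The snapshot still contains two weakrefs, but only one of them yields an
object, so filtering at snapshot time would not have spared the caller
the check.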

