weakref.WeakKeyDictionary is (basically) useless

Let's talk about weakref.WeakKeyDictionary. First, what is it? It's a dictionary whose keys are referenced weakly. That is, the dictionary takes weak references to its keys. If a key is garbage collected, it magically vanishes from the dictionary. This saves programmers much of the trouble of manually culling dead weak references from data structures. Why would you use it? The official documentation mentions implementing a [descriptor][1]. See [this paste][2] for a simple example of this sort of thing. Turning the WeakKeyDictionary into a regular dictionary would be troublesome. Since the descriptor object will (probably) never die, if it used a normal dictionary, it would keep all/most/some/more-than-one instances of Bar() alive forever. While this simple example would be better implemented as a property, or even a bare attribute, it is nice if you have many separate Bar classes and complicated per-property processing. Why is it broken? To go back to our simple example, imagine someone subclasses Bar (let's call the subclass Baz) and implements __eq__() and/or __hash__(). Instances of Baz may not work at all (if __hash__ is None) or they may "work" but behave nonsensically (all values-equal instances of Baz will have the same .foo attribute - imagine trying to track that bug down!). While we could forbid such a subclass in our API docs, this runs afoul of the open/closed principle. What can we do about it? It's possible someone is relying on this behavior, so we probably should not change the existing class. Instead, we can offer a new class called, say, WeakKeyIDDictionary. Said class would use object identity instead of object equality/hashing to uniquify keys. It would also reference keys weakly, just like WeakKeyDictionary does now. Implementing such a class is somewhat tricky, since weak references are difficult to reason about correctly. I've written up a [simplistic prototype][3], but I'm concerned about both performance and correctness. Specifically, I know that I need to call _remove_dead_refs() periodically, but I don't see where I can do that without compromising performance (e.g. making key lookup O(n) is a Bad Idea). I also would note it looks nothing like the stdlib implementations of WeakKey/ValueDictionary, which finagle raw weakrefs directly. These problems are related, of course, since the stdlib uses finalizer callbacks to cull dead references as they die. Frankly, the stdlib's code looks incredibly hairy to me and I doubt I could correctly re-implement it. Finally, I've no idea if it's subject to race conditions or other issues (it is a completely untested example I threw together in 5 minutes). Because this class is simultaneously useful and hard to write correctly, it would be a nice addition to the standard library (probably not in the form I've written it!), or failing that, some other library. I've Googled for such a library and come up empty. Should I submit a bug, a PEP, or is this whole idea just stupid? [1]: https://docs.python.org/3/reference/datamodel.html#implementing-descriptors [2]: http://pastebin.com/90j2ZAqC [3]: http://pastebin.com/cpZh2bTa -- Kevin Norris

On Tue, 30 Dec 2014 19:22:36 -0500 Kevin Norris <nykevin.norris@gmail.com> wrote:
How about something like this (untested): import weakref class Foo: def __init__(self): self._data = {} def __get__(self, obj, owner): _, val = self._data[id(obj)] return val def __set__(self, obj, value): key = id(obj) try: ref, _ = self._data[key] except KeyError: def on_destroy(_): del self._data[key] ref = weakref.ref(obj, on_destroy) self._data[key] = ref, value def __delete__(self, obj): del self._data[id(obj)] class Bar: foo = Foo() Regards Antoine.

On Tue, 30 Dec 2014 19:22:36 -0500 Kevin Norris <nykevin.norris@gmail.com> wrote:
How about something like this (untested): import weakref class Foo: def __init__(self): self._data = {} def __get__(self, obj, owner): _, val = self._data[id(obj)] return val def __set__(self, obj, value): key = id(obj) try: ref, _ = self._data[key] except KeyError: def on_destroy(_): del self._data[key] ref = weakref.ref(obj, on_destroy) self._data[key] = ref, value def __delete__(self, obj): del self._data[id(obj)] class Bar: foo = Foo() Regards Antoine.
participants (3)
-
Antoine Pitrou
-
Benjamin Peterson
-
Kevin Norris