WeakValueDict and threadsafety
88888 Dihedral
dihedral88888 at googlemail.com
Sat Dec 10 13:44:22 EST 2011
On Sunday, December 11, 2011 1:56:38 AM UTC+8, Darren Dale wrote:
> On Dec 10, 11:19 am, Duncan Booth <duncan.bo... at invalid.invalid>
> wrote:
> > Darren Dale <dsda... at gmail.com> wrote:
> > > I'm concerned that this is not actually thread-safe. When I no longer
> > > hold strong references to an instance of data, at some point the
> > > garbage collector will kick in and remove that entry from my registry.
> > > How can I ensure the garbage collection process does not modify the
> > > registry while I'm holding the lock?
> >
> > You can't, but it shouldn't matter.
> >
> > So long as you have a strong reference in 'data' that particular object
> > will continue to exist. Other entries in 'registry' might disappear while
> > you are holding your lock but that shouldn't matter to you.
> >
> > What is concerning though is that you are using `id(data)` as the key and
> > then presumably storing that separately as your `oid` value. If the
> > lifetime of the value stored as `oid` exceeds the lifetime of the strong
> > references to `data` then you might get a new data value created with the
> > same id as some previous value.
> >
> > In other words I think there's a problem here, but nothing to do with the
> > lock.
>
> Thank you for the considered response. In reality, I am not using
> id(data). I took that from the example in the documentation at
> python.org in order to illustrate the basic approach, but it looks
> like I introduced an error in the code. It should read:
>
> def get_data(oid):
>     with reglock:
>         data = registry.get(oid, None)
>         if data is None:
>             data = make_data(oid)
>             registry[oid] = data
>     return data
>
> Does that look better? I am actually working on the h5py project
> (bindings to hdf5), and the oid is an hdf5 object identifier.
> make_data(oid) creates a proxy object that stores a strong reference
> to oid.
>
> My concern is that the garbage collector is modifying the dictionary
> underlying WeakValueDictionary at the same time that my multithreaded
> code is trying to access it, producing a race condition. This morning
> I wrote a synchronized version of WeakValueDictionary (actually
> implemented in cython):
>
> class _Registry:
>
>     def __cinit__(self):
>         def remove(wr, selfref=ref(self)):
>             self = selfref()
>             if self is not None:
>                 self._delitem(wr.key)
>         self._remove = remove
>         self._data = {}
>         self._lock = FastRLock()
>
>     __hash__ = None
>
>     def __setitem__(self, key, val):
>         with self._lock:
>             self._data[key] = KeyedRef(val, self._remove, key)
>
>     def _delitem(self, key):
>         with self._lock:
>             del self._data[key]
>
>     def get(self, key, default=None):
>         with self._lock:
>             try:
>                 wr = self._data[key]
>             except KeyError:
>                 return default
>             else:
>                 o = wr()
>                 if o is None:
>                     return default
>                 else:
>                     return o
>
> Now that I am using this _Registry class instead of
> WeakValueDictionary, my test scripts and my actual program are no
> longer producing segfaults.
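
For readers without Cython, the synchronized registry quoted above can be roughly sketched in pure Python. This is only an illustration under stated assumptions, not the h5py code: `threading.RLock` stands in for `FastRLock`, the names `SyncWeakValueRegistry` and `Proxy` are made up for the example, `weakref.KeyedRef` is the helper class the standard library's own `WeakValueDictionary` uses internally, and the identity check in `_delitem` plus the `__len__` method are additions that are not in the quoted class.

```python
import threading
import weakref


class SyncWeakValueRegistry:
    """Weak-value mapping whose reads, writes and removals share one lock."""

    def __init__(self):
        self._data = {}
        # Reentrant on purpose: a weakref callback can fire in a thread
        # that is already inside one of the locked methods below.
        self._lock = threading.RLock()

        def remove(wr, selfref=weakref.ref(self)):
            # Hold the registry only weakly so the callback does not
            # keep the registry itself alive.
            registry = selfref()
            if registry is not None:
                registry._delitem(wr.key, wr)

        self._remove = remove

    def __setitem__(self, key, value):
        with self._lock:
            self._data[key] = weakref.KeyedRef(value, self._remove, key)

    def _delitem(self, key, wr):
        with self._lock:
            # Assumption beyond the quoted code: delete only if the slot
            # still holds the dead reference, not a newer entry.
            if self._data.get(key) is wr:
                del self._data[key]

    def get(self, key, default=None):
        with self._lock:
            wr = self._data.get(key)
            obj = wr() if wr is not None else None
            return default if obj is None else obj

    def __len__(self):
        with self._lock:
            return len(self._data)


class Proxy:
    """Hypothetical stand-in for the proxy object that make_data() builds."""

    def __init__(self, oid):
        self.oid = oid  # strong reference to the identifier


registry = SyncWeakValueRegistry()
reglock = threading.Lock()


def make_data(oid):
    return Proxy(oid)


def get_data(oid):
    # Same lookup-or-create pattern as the get_data() quoted above.
    with reglock:
        data = registry.get(oid)
        if data is None:
            data = make_data(oid)
            registry[oid] = data
    return data
```

As long as a caller holds the object returned by `get_data(oid)`, later calls with the same `oid` return that same object; once the last strong reference goes away, the callback removes the entry while holding the registry's lock.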
I'd prefer iterators and iterables that can accept a global signal,
called a clock, to replace this CS mess.