[Python-ideas] thread safe dictionary initialisation from mappings: dict(non_dict_mapping)
tjreedy at udel.edu
Mon Nov 19 22:09:57 CET 2012
On 11/19/2012 12:24 PM, Anselm Kruis wrote:
> I found the following annoying behaviour of dict(non_dict_mapping) and
> dict.update(non_dict_mapping), if non_dict_mapping implements
> collections.abc.Mapping but is not an instance of dict. In this case the
> implementations of dict() and dict.update() use PyDict_Merge(PyObject
> *a, PyObject *b, int override).
> The essential part of PyDict_Merge(a,b, override) is
>     # update dict a with the content of mapping b.
>     keys = b.keys()
>     for key in keys:
>         a[key] = b.__getitem__(key)
> This algorithm is susceptible to race conditions, if a second thread
> modifies the source mapping b between "b.keys()" and b.__getitem__(key):
> - If the second thread deletes an item from b, PyDict_Merge fails with a
> KeyError exception.
> - If the second thread inserts a new value and then modifies an existing
> value, a contains the modified value but not the new value.
It is well-known that mutating a collection while iterating over it can
lead to unexpected or undesired behavior, including exceptions. This is
not limited to updating a dict from a non-dict source. The generic answer
is Don't Do That.
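
For anyone who wants to see the first failure mode without depending on thread timing, here is a minimal single-threaded sketch. It assumes the CPython behaviour described in the quoted text (dict() calls keys() on a non-dict mapping, then fetches each value with __getitem__). RacyMapping is a hypothetical class invented for this illustration; its keys() method removes an item right after taking the key snapshot, standing in for a second thread.

```python
from collections.abc import Mapping


class RacyMapping(Mapping):
    """Hypothetical non-dict mapping that mutates itself inside keys()."""

    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

    def keys(self):
        # Take the key snapshot that dict() will iterate over ...
        snapshot = list(self._data)
        # ... then simulate another thread deleting "b" before the
        # values are fetched.
        self._data.pop("b", None)
        return snapshot


def try_copy(source):
    """Return dict(source), or the KeyError raised while copying."""
    try:
        return dict(source)
    except KeyError as exc:
        return exc


result = try_copy(RacyMapping({"a": 1, "b": 2}))
```

Under that assumption, result ends up being the KeyError for "b" rather than a dict, which is the failure the original post describes.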
> Of course the current behaviour is the best you can get with a "minimum
> mapping interface".
To me, if you know that the source in d.update(source) is managed (and
mutated) in another thread, the obvious solution (to Not Do That) is to
lock the source. This should work for any source and for any similar
operation. What am I missing?
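
A rough sketch of what I mean by locking the source, assuming every thread that touches the source agrees to take the same lock. The names source, source_lock, mutate and snapshot are made up for this example; UserDict is used only because it is a mapping that is not a dict subclass.

```python
import threading
from collections import UserDict

# A non-dict mapping shared between threads, plus the lock guarding it.
source = UserDict({i: i for i in range(1000)})
source_lock = threading.Lock()


def mutate():
    """Writer thread: replace one key per step while holding the lock."""
    for i in range(1000):
        with source_lock:
            del source[i]
            source[i + 1000] = i


def snapshot():
    """Reader: copy the source while no writer can interleave."""
    with source_lock:
        return dict(source)


writer = threading.Thread(target=mutate)
writer.start()
snapshots = [snapshot() for _ in range(50)]
writer.join()
```

Each writer step keeps the size at 1000 while it holds the lock, so every snapshot should contain exactly 1000 items and no copy should fail with KeyError. The same pattern applies to d.update(source) or any other operation that iterates over the source.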
Instead, you propose to add a specialized, convoluted method that only
works for updates of dicts by non_dict_mappings that happen to have a
new and very specialized magic method that automatically does the lock.
Sorry, I don't see the point. It is not at all a generic solution to a
Terry Jan Reedy