[Python-ideas] thread safe dictionary initialisation from mappings: dict(non_dict_mapping)
Anselm Kruis
a.kruis at science-computing.de
Tue Nov 20 10:33:53 CET 2012
On 19.11.2012 22:09, Terry Reedy wrote:
> On 11/19/2012 12:24 PM, Anselm Kruis wrote:
>> Hello,
>>
>> I found the following annoying behaviour of dict(non_dict_mapping) and
>> dict.update(non_dict_mapping) when non_dict_mapping implements
>> collections.abc.Mapping but is not an instance of dict. In this case the
>> implementations of dict() and dict.update() use PyDict_Merge(PyObject
>> *a, PyObject *b, int override).
>>
>> The essential part of PyDict_Merge(a,b, override) is
>>
>>     # update dict a with the content of mapping b.
>>     keys = b.keys()
>>     for key in keys:
>>         ...
>>         a[key] = b.__getitem__(key)
>>
>> This algorithm is susceptible to race conditions if a second thread
>> modifies the source mapping b between "b.keys()" and "b.__getitem__(key)":
>> - If the second thread deletes an item from b, PyDict_Merge fails with a
>> KeyError exception.
>> - If the second thread inserts a new value and then modifies an existing
>> value, a contains the modified value but not the new value.
>
> It is well-known that mutating a collection while iterating over it can
> lead to unexpected or undesired behavior, including exceptions. This is
> not limited to updating a dict from a non-dict source. The generic answer
> is Don't Do That.
Actually that's not the case here: the implementation of dict does not
iterate over the collection while another thread mutates the collection.
It iterates over a list of the keys and this list does not change.
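
A minimal sketch of the failure mode, for illustration only. RacyMapping is
a hypothetical class, and the "second thread" is simulated inside keys() so
that the outcome is deterministic: the key list is a fixed snapshot, and the
later lookup of a deleted key is what fails.

```python
from collections.abc import Mapping


class RacyMapping(Mapping):
    """Hypothetical non-dict mapping used only to illustrate the race."""

    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(list(self._data))

    def __len__(self):
        return len(self._data)

    def keys(self):
        # Take the snapshot that dict() will iterate over.
        snapshot = list(self._data)
        # Simulate a second thread deleting "b" right after the snapshot.
        self._data.pop("b", None)
        return snapshot


def try_copy(mapping):
    """Return dict(mapping), or the KeyError raised while copying."""
    try:
        return dict(mapping)
    except KeyError as exc:
        return exc


result = try_copy(RacyMapping({"a": 1, "b": 2}))
```

Here result is the KeyError for "b": the snapshot still lists the key, but
the mapping no longer holds it when dict() looks the value up.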
>
>> Of course the current behaviour is the best you can get with a "minimum
>> mapping interface".
>
> To me, if you know that the source in d.update(source) is managed (and
> mutated) in another thread, the obvious solution (to Not Do That) is to
> lock the source. This should work for any source and for any similar
> operation. What am I missing?
> Instead, you propose to add a specialized, convoluted method that only
> works for updates of dicts by non_dict_mappings that happen to have a
> new and very specialized magic method that automatically does the lock.
> Sorry, I don't see the point. It is not at all a generic solution to a
> generic problem.
What you are missing is the automatic locking. For list- and set-like
collections it is already possible to implement this kind of automatic
locking, because iterating over them returns the complete information.
Mappings are special because their items are key-value pairs, which dict()
fetches in two separate steps.
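
A minimal sketch of what automatic locking can look like for a list-like
collection. LockedList is a hypothetical class, not an existing API: its
__iter__ copies the content while holding the lock, so a consumer such as
list() sees one consistent state without doing any locking itself.

```python
import threading


class LockedList:
    """Hypothetical list-like container that locks automatically."""

    def __init__(self, items=()):
        self._lock = threading.Lock()
        self._items = list(items)

    def append(self, item):
        with self._lock:
            self._items.append(item)

    def __iter__(self):
        # The whole content is copied under the lock, so the iterator
        # works on a consistent snapshot.
        with self._lock:
            snapshot = list(self._items)
        return iter(snapshot)


shared = LockedList([1, 2, 3])
iterator = iter(shared)    # snapshot taken here
shared.append(4)           # later mutation, e.g. from another thread
copied = list(iterator)    # still the state at snapshot time
current = list(shared)     # a fresh iteration sees the new item
```

A mapping cannot offer the same guarantee to dict() through keys() and
__getitem__ alone, because the lock is released between the two calls.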
Whether automatic locking of a collection is the right solution to a
particular problem depends on the problem. There are problems where
automatic locking is the best choice, and I think Python should support it.
(Whether my particular application belongs to this class of problems is
another question and not relevant here.)
Regards
Anselm
--
Dipl. Phys. Anselm Kruis science + computing ag
Senior Solution Architect Ingolstädter Str. 22
email A.Kruis at science-computing.de 80807 München, Germany
phone +49 89 356386 874 fax 737 www.science-computing.de