Thread-safe way to add a key to a dict only if it isn't already there?
Marko Rauhamaa
marko at pacujo.net
Sun Jul 8 16:42:28 EDT 2018
Chris Angelico <rosuav at gmail.com>:
> On Mon, Jul 9, 2018 at 5:18 AM, Marko Rauhamaa <marko at pacujo.net> wrote:
>> Chris Angelico <rosuav at gmail.com>:
>>> Are you assuming that Python's semantics are defined by the semantics
>>> of one possible implementation language?
>>
>> What are Python's semantics defined by? I've been using these:
>>
>> <URL: https://docs.python.org/3/reference/>
>>
>> <URL: https://docs.python.org/3/library/>
>>
>> Unfortunately, neither spec says anything about the atomicity of
>> dict.setdefault().
>>
>> Therefore, the application programmer must assume it is not atomic. In
>> fact, as brought up in this discussion, the consultation of
>> object.__hash__() and object.__eq__() almost guarantee the
>> *non*-atomicity of dict.setdefault().
>
> If by "atomic" you mean that absolutely no context switch can occur
> during setdefault, then it probably isn't. But the point of an atomic
> query/update operation is that there are exactly two possibilities:
>
> 1) The key did not exist in the dictionary. It now does, with the
> provided default, which was returned.
> 2) The key did exist in the dictionary. The provided default is
> ignored, and the previous value is returned.
This is a classic test-and-set race condition:
# Initially
assert "A" not in d
# Thread 1
d.setdefault("A", 1)
# Thread 2
d["A"] = 2
If dict.setdefault() is atomic, the end result in all timings must be:
d["A"] == 2
However, if dict.setdefault() is not atomic, the end result may also be:
d["A"] == 1
> Neither object.__hash__ nor object.__eq__ gives any way for this to be
> violated (unless you mess with the concept of "the key did exist in
> the dictionary" by breaking the definition of equality, but that's
> nothing to do with atomicity). Here's the definition of setdefault:
>
> setdefault(key, default=None, /) method of builtins.dict instance
> Insert key with a value of default if key is not in the dictionary.
>
> Return the value for key if key is in the dictionary, else default.
>
> Simple semantics. Straight-forward. Now, maybe there's a bug in some
> implementation whereby threads can violate this; but that would be a
> bug to be fixed, and nothing more. You should be able to assume that
> it will behave as stated.
Since atomicity is not guaranteed by the definition of the method, the
application programmer must be prepared for the end result to be 1 (or
even something completely different because the Language Specification
doesn't say anything about the outcome of data races).
Marko
More information about the Python-list
mailing list