Thread-safe way to add a key to a dict only if it isn't already there?
Steven D'Aprano
steve+comp.lang.python at pearwood.info
Sat Jul 7 21:04:18 EDT 2018
On Sun, 08 Jul 2018 00:00:26 +0300, Marko Rauhamaa wrote:
> Ian Kelly <ian.g.kelly at gmail.com>:
>> the leaning of the devs seems to be to refrain from documenting it and
>> instead document that *no* operations are guaranteed atomic.
>
> I believe that to be wise. Otherwise, Python would limit its future
> implementation options.
o_O
By that logic, one should say we shouldn't document the semantics of
operations, so we don't limit the implementation.
"What does list.sort do?"
"It currently sorts the list using a stable comparison sort, but you
can't rely on that it, because we didn't want to limit the
implementation."
Sorting is guaranteed to be stable. Does that limit the implementation
options? Yes, of course it does. So does the fact that it *sorts* --
we're limited to implementations which *sort*, and specifically
comparison sorts (not, for example, radix sorts).
Changing the implementation to a radix sort that only works on ints would
break things. So would changing the implementation to something which
empties the list of all elements. ("The empty list is always sorted.")
Stable semantics are more important than freedom to change implementation.
Whether or not an operation is thread-safe, or atomic, is part of the
semantics of the operation. It's not merely a performance issue, and we
ought to treat it will a bit more respect.
I'm not saying that everything needs to be documented as thread-safe or
not (for at least one release, Python's sorting actually was stable
before it was documented as such) but the gung-ho attitude that safety is
just an implementation detail should be resisted.
Changing implementations from one which is thread safe to one which is
not can break people's code, and shouldn't be done on a whim. Especially
since such breakage could be subtle, hard to notice, harder to track
down, and even harder still to fix.
> The only thing Python should guarantee is that the data structures stay
> "coherent" under race conditions. In other words, there cannot be a
> segmentation fault. For example, if two threads executed this code in
> parallel:
>
> global i
> i = 1
> i += 1
>
> a legal end result could be that i contains the string "impossible".
That wouldn't be coherent. The only coherent results are that i could
equal either 2 or 3:
Possibility 1:
i = 1 # thread 1
i += 1 # thread 1
i = 1 # thread 2
i += 1 # thread 2
assert i == 2
Possibility 2:
i = 1 # thread 1
i = 1 # thread 2
i += 1 # thread 1
i += 1 # thread 2
assert i == 3
Executing statements out of order is impossible, even in threaded code.
Redefining the meaning of the integer literal 1 is impossible (ignoring
unsupported shenanigans with ctypes). Monkey-patching the int __iadd__
method is impossible. So there is no coherent way to get a result of
"impossible" from just adding 1 to 1 in any coherent implementation of
Python.
--
Steven D'Aprano
"Ever since I learned about confirmation bias, I've been seeing
it everywhere." -- Jon Ronson
More information about the Python-list
mailing list