Thread-safe way to add a key to a dict only if it isn't already there?
steve+comp.lang.python at pearwood.info
Sun Jul 8 06:50:12 EDT 2018
On Sun, 08 Jul 2018 10:52:15 +0300, Marko Rauhamaa wrote:
> You are on the right track, Chris, but you are still deducing behavior
> from a particular implementation. For example Java's definition is
> approximately the one I give above:
> Without correct synchronization, very strange, confusing and
> counterintuitive behaviors are possible.
Indeed, and that applies to all languages with threading, including
Python. But notice that it doesn't say:
Without correct synchronization, impossible things are possible.
> <URL: https://docs.oracle.com/javase/specs/jls/se7/html/jls-17.htm
You seem to have read this as "Java threading can do impossible things",
but I think the correct reading of the article is "Here are the possible
things that Java threading can do to mess with your mind". That's a much
smaller subset of things which can go wrong.
(But still pretty big.)
See, for example Example 17.4-1, which explicitly shows that the Java
execution model is permitted to reorder statements:
r1 = A; B = 1;
may be reordered to
B = 1; r1 = A;
That's harmless in single-threaded code, but risky in multi-threaded
code. The result is that in the example given, r1 could end up being set
to 1 instead of the expected values -- but it cannot possibly be set to
the string "impossible", since that goes against the Java type system.
In Python, there's no possible order of operations of assigning 1 and
adding 1 that could result in a string. No Python compiler or interpreter
is permitted to evaluate `1 + 1` or `2 + 1` as the string "impossible",
even if threads are involved.
(That's not to say that threads might not assign a string to the variable
we expected to contain 2, but it would have to do so by assignment, not
by mucking about with the execution order of integer addition.)
In the case of Example 17.4-1, starting with A = B = 0, two threads
Thread 1: r2 = A; B = 1;
Thread 2: r1 = B; A = 2;
Java compilers are permitted to inspect each thread *in isolation*,
decide that the order of lines is invisible to the caller, and so re-
order them. The result of that is that those four lines can be executed
in any permutation, giving a total of 24 separate possibilities.
py> from itertools import permutations
py> instructions = ["r2 = A;", "B = 1;", "r1 = B;", "A = 2;"]
But Python (at least for now) cannot do this. It has a more deterministic
execution order which guarantees top-to-bottom execution of Python
statements. That implies that there are only *six* possible execution
orders of those two threads (which is still enough to add non-determinism
to your code!) not twenty-four:
r2 = A; B = 1; r1 = B; A = 2;
r2 = A; r1 = B; B = 1; A = 2;
r2 = A; r1 = B; A = 2; B = 1;
r1 = B; A = 2; r2 = A; B = 1;
r1 = B; r2 = A; A = 2; B = 1;
r1 = B; r2 = A; A = 2; B = 1;
and that remains true even if the underlying implementation is written in
Java, or Malbolge for that matter.
> And we know that Python has been implemented using Java...
That's irrelevant. A legal Python does not inherit the quirks of the
underlying implementation -- it must still follow the rules of the
language, or it isn't Python. Java is statically typed, with machine
ints, Python is dynamically typed, with no machine ints.
"Ever since I learned about confirmation bias, I've been seeing
it everywhere." -- Jon Ronson
More information about the Python-list