
On Sun, Jun 28, 2020 at 3:39 AM Steven D'Aprano steve@pearwood.info wrote:
TL;DR
If the `set.add` feature can return a flag stating whether the element was previously present, cheaply and atomically, without slowing down the single-threaded case with locks, then that's great (as I already stated...) and this entire subthread about threading is irrelevant and you can stop reading now.
If not, then we should consider carefully the consequences of adding a feature that will either slow down the common single-threaded case, or potentially be an attractive nuisance in the multi-threaded case.
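For concreteness, the intended return value can be emulated with a pure-Python subclass (a hypothetical `ReportingSet`, used here only for illustration; the actual proposal would fold the check and the insert into one atomic C-level operation, which this version does not):

```python
class ReportingSet(set):
    """Toy emulation of the proposed behaviour: add() reports whether
    the element was already present.  NOTE: this check-then-add is NOT
    atomic, unlike the proposed C-level implementation; it only shows
    the intended return value."""
    def add(self, elem):
        was_present = elem in self   # non-atomic membership check
        super().add(elem)
        return was_present
```

With this sketch, `s.add(x)` returns False on first insertion and True thereafter, which is exactly the flag under discussion.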
CPython has the GIL. Every other Python will have some equivalent way to ensure that its own internal data structures maintain their integrity.
At some point, there will be a bit of code somewhere that decides whether to change some bits in memory, or not change some bits in memory. In CPython, there's basically no way that this can be done safely without also guaranteeing the atomicity needed, and therefore there is no problem. Other implementations may have something slightly different, but the chances of a problem that doesn't also cause fundamental internal issues would be extremely low.
The worst case scenario that I can imagine is a non-refcounting Python (let's say it uses mark-and-sweep garbage collection), a set is defined as a hash table, and insertion into it is performed by writing a single pointer into a single cell in the base array. In a multithreaded situation, you could have two threads simultaneously try to add the same item, or simultaneously try to add two different items. In order to guarantee that adding two items won't accidentally overwrite one (which would imply that "s.add(1)" could silently discard a completely unrelated value), there would need to be a safe way to write one pointer into one cell with a guarantee that the cell had previously been empty. And that is, in fact, precisely what is needed for the proposed feature: "insert this, and be certain that it hadn't previously been here". This is basically what "lock cmpxchg" is for on the Intel CPUs, and equivalents on others.
There could be lock-free code before and after this (although in CPython, the GIL basically means there won't be), but at some point, there would need to be that "final action" that is atomic, and it *already* needs to be. This feature does not increase the atomicity requirements.
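As a sketch of that "final action", here is a toy fixed-size open-addressing set in which the only synchronised step is claiming one cell. A lock stands in for what `lock cmpxchg` provides in hardware; `TinySet` is hypothetical, with no resizing or deletion, purely to show where the atomicity requirement lives:

```python
import threading

class TinySet:
    """Toy open-addressing hash set.  The only step that must be atomic
    is the compare-and-claim of a single cell; a Lock simulates what
    'lock cmpxchg' would do at the hardware level."""
    def __init__(self, size=8):
        self._cells = [None] * size
        self._lock = threading.Lock()   # stand-in for lock cmpxchg

    def add(self, item):
        """Insert item; return True if it was already present."""
        i = hash(item) % len(self._cells)
        while True:
            with self._lock:            # atomic compare-and-claim
                cur = self._cells[i]
                if cur is None:
                    self._cells[i] = item   # cell was empty: claim it
                    return False            # item was newly added
                if cur == item:
                    return True             # item already present
            i = (i + 1) % len(self._cells)  # linear probing
```

The point of the sketch: the insert-and-report-flag operation needs no atomicity beyond what the insert itself already requires.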
which is why people much smarter than me have spent decades designing languages and execution models specifically to deal with concurrency and parallel computing safely:
https://en.wikipedia.org/wiki/List_of_concurrent_and_parallel_programming_la...
Well... yes. That's exactly what I'm building on here. If you have a Python implementation that includes threads, then this functionality MUST already exist.
or simply recommending that, when possible, we ought to avoid mutable shared state (in the same way we avoid GOTOs and global variables and other anti-patterns):
https://jaxenter.com/akka-anti-patterns-shared-mutable-state-128827.html
https://exploringjs.com/deep-js/ch_shared-mutable-state.html
Right. Avoid it where possible, because it has many costs. But fundamentally it can't be completely eliminated without some ridiculous dances, and I'm assuming here that the set *is* being mutated from multiple threads simultaneously. If it isn't (either because it's local to one thread, or it's in a "one writer, many readers" model), it's easy, and the only place that needs to have any locking whatsoever is a hash resize or equivalent. This discussion is all about the worst case scenario: two threads concurrently trying to insert into the same set.
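The race window in the "check, then insert" idiom can be shown deterministically by interleaving the steps of two logical threads by hand (no real threads involved; the bad interleaving is simulated):

```python
# Deterministic simulation of the race in "check, then insert":
# two logical threads, T1 and T2, both trying to claim the same element.
s = set()
x = "job-1"

t1_sees_absent = x not in s   # T1 checks: absent
t2_sees_absent = x not in s   # T2 checks before T1 adds: also absent
s.add(x)                      # T1 adds
s.add(x)                      # T2 adds (a no-op, but T2 can't tell)

# Both threads concluded they were the first to insert:
assert t1_sees_absent and t2_sees_absent
assert len(s) == 1
```

An `add` that atomically returned the membership flag would close this window: exactly one of the two calls would report "newly added".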
I'm not sure why this is so controversial. I'm making a moderate claim here: threading is much harder to get right than sequential code.
And I'm making a counter-claim: threading is an already-solved problem. Sure it's harder, but it's already harder and we're dealing with it.
Please let's not over-generalise here. I'm not opposed to learning in general; I'm pointing out that writing concurrent-style code for something which is trivially thought of as "if the element has not been seen, add it" is not the most natural way of thinking about the task.
Should you have to think about resource leakage with open() calls, or do you just learn to follow best practice and use a 'with' block? Sometimes we don't need to learn about concurrency to do the right thing, we just learn what's the right thing, and if we're curious as to WHY it's the right thing, we can *then* learn about concurrency.
So `add` would have to be atomic at the C level. If it is already atomic, great, that's one objection neutralised. But if not, then will making it atomic have a performance impact?
Question: If it ISN'T atomic at the C level, what happens?
Sometimes it will return False when it should have returned True, or True when it should have returned False, and because of that wrong answer, people will have mysterious bugs, crashes, failures, and code that silently does the wrong thing.
My point is that this cannot happen.
I would expect that set.add() is guaranteed to not add a duplicate, and that right there basically guarantees everything that I need.
Don't you also need the method to guarantee to return True only if the element was already in the set, and False only if the element was not in the set? That is the point of the proposed feature.
Propose a way that it could get the return value wrong, but couldn't also accidentally discard some other item, or corrupt fundamental internals (like CPython's reference counter).
No, I'm not. I'm only talking about this as a concept in things that reject duplicates. So at best, it would be sets and dicts. Why on earth would you assume I would want this operation on a list?
I commented that the "check, then insert" idiom works with lists etc as well as sets, but the "insert and return a flag" idiom doesn't work with lists etc. You countered that by saying that the "insert and return flag" would work equally well. It can't work at all if it doesn't exist. So for it to work just as well for lists as it works for sets, it has to be added to lists.
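For comparison, the list analogue of the proposed flag would have to be written as a helper (a hypothetical `add_to_list`, shown only to make the contrast concrete), and it is necessarily two separate steps with a race window between them:

```python
def add_to_list(lst, x):
    """List analogue of the proposed set.add flag: append x unless it is
    already present, and report whether it was already there.  Unlike
    the set proposal, this is inherently check-then-act, with a race
    window between the membership test and the append."""
    if x in lst:          # O(n) membership test, vs O(1) for a set
        return True       # already present
    lst.append(x)
    return False          # newly added
```

Lists also don't reject duplicates natively, so this helper imposes a policy the data structure itself doesn't have, which is why the feature maps naturally onto sets but not lists.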
When do you need this for a list, and do you actually need this guarantee of atomicity?
Are you THAT desperate for an argument?
Is that what we're having? I thought we were discussing the pros and cons of a proposed feature. That's what the purpose of Python-Ideas is, or at least was.
That's the second time you've made that comment. It would be more considerate, open and empathic to assume I'm having a good faith discussion instead of accusing me of trolling.
I'm honestly not sure what your motivation is. You're clutching at tiny straws in order to debate a point. Either that, or you're so terrified of threading that you don't want to learn anything about it, and yet want to debate against it. Please, prove me wrong. I just keep finding you to be argumentative rather than actually trying to find the truth.
They have different semantics, and I can imagine cases where both would be useful. Although I suppose option 1 is moot if sets don't overwrite the existing element.
They don't have different semantics, so the distinction is irrelevant.
In one the add method is unconditionally called. In the other, the add method is only sometimes called. The difference in semantics is clear if you consider that this will affect not only builtin sets, but also subclasses which may have side-effects or perform other computations.
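That observable difference with subclasses can be demonstrated with a toy `CountingSet` (hypothetical, for illustration) that counts calls to `add`: the unconditional form calls `add` every time, the conditional form only when the element is absent:

```python
class CountingSet(set):
    """Set subclass with an observable side-effect: it counts add() calls."""
    def __init__(self, *args):
        super().__init__(*args)
        self.add_calls = 0
    def add(self, elem):
        self.add_calls += 1
        super().add(elem)

s1, s2 = CountingSet(), CountingSet()
x = 1

# Option 1: unconditionally call add, twice
s1.add(x)
s1.add(x)

# Option 2: only call add when the element is absent, twice
for _ in range(2):
    if x not in s2:
        s2.add(x)

# Same final contents, different number of add() calls observed
assert s1 == s2 == {1}
assert s1.add_calls == 2
assert s2.add_calls == 1
```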
You said yourself that option 1 is moot (since sets indeed don't overwrite). So the true semantic difference with core sets is that one of them has a bug in it (a TOCTOU flaw) and the other doesn't. With subclasses, well, anything could happen - the "in" operator could be implemented strictly backwards, or the "add" method could be written to throw an exception if today is Friday, or anything. We aren't defining semantics for every subclass, just for the core.
ChrisA