On Mar 15, 2012 1:53 AM, "Masklinn" <masklinn@masklinn.net> wrote:
On 2012-03-14, at 18:36 , Matt Joiner wrote:
set.add(x) could return True if x was added to the set, and False if x was already in the set.
That does not mesh with the usual Python semantics of methods either having a side-effect (mutation) or returning a value. Why would that happen with sets but not with e.g. dicts?
Because dict insertions are by operator?
Adding an element that is already present often constitutes an error in
my code.
Then throw an error when that happens?
As I understand, set.add is an atomic operation. Having set.add return a boolean will also allow EAFP-style code with regard to handling duplicates, the long-winded form of which is currently:
    if a not in b:
        b.add(a)  # <-- race condition
        do_c()
Which can be improved to:
    if b.add(a):
        do_c()
Advantages:
* Very common code pattern.
* More concise.
* Allows interpreter atomicity to be exploited, often removing the need for additional locking.
* Faster because it avoids a double containment check, and can avoid locking.
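For readers who want to try the proposed behaviour, here is a minimal sketch of a set whose add() reports whether the element was newly inserted. The name `ReportingSet` is hypothetical and nothing like it exists in the standard library. The explicit lock is also an assumption of this sketch: the proposal relies on the built-in set.add being atomic inside the interpreter, whereas a pure-Python subclass has to make the check and the insert a single step by itself.

```python
import threading


class ReportingSet(set):
    """Hypothetical set subclass whose add() returns True only for new items."""

    def __init__(self, iterable=()):
        super().__init__(iterable)
        # Guards only add(); other mutating methods are left unlocked here.
        self._lock = threading.Lock()

    def add(self, item):
        # Check and insert under one lock so two callers adding the same
        # item cannot both be told that they inserted it.
        with self._lock:
            if item in self:
                return False
            super().add(item)
            return True


b = ReportingSet()
first = b.add("a")   # element was absent, so it is inserted
second = b.add("a")  # element is already present, nothing changes
```

With this in place the short form `if b.add(a): do_c()` behaves as described above. It says nothing about whether `a` is still in `b` by the time do_c() runs.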
Nope, as Andrew noted it's possible that another thread has *removed* the element from the set before do_c runs (so you've got your race condition right there, assuming do_c expects `a` to be in `b`).
And since you're using a set, `b.add(a)` is a no-op if `a` is already in `b`, so there's no race condition at the point you note. The race condition is instead in the condition itself: two different threads can both find that the value is not in the set and both end up executing do_c.
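The interleaving described here can be reproduced deterministically. The sketch below is only an illustration: the `worker` function, the `calls` list standing in for do_c(), and the `threading.Barrier` used to hold both threads between the check and the add are all assumptions made for the demonstration, not part of anyone's real code.

```python
import threading

b = set()
calls = []                        # records each "do_c()" invocation
barrier = threading.Barrier(2)    # holds both threads just after the check


def worker(name):
    if "a" not in b:              # both threads see the element as absent
        barrier.wait()            # neither adds until both passed the check
        b.add("a")                # second add is a harmless no-op
        calls.append(name)        # stands in for do_c()


threads = [threading.Thread(target=worker, args=(n,)) for n in ("t1", "t2")]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Both threads end up recording a call even though the set contains a single element, which is the duplicate execution of do_c described above.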
There's still a performance cost in looking up the already present value a second time.