[Python-ideas] Fwd: Concurrent safety?
Mike Meyer
mwm at mired.org
Thu Nov 3 17:38:26 CET 2011
On Thu, Nov 3, 2011 at 12:39 AM, Bruce Leban <bruce at leapyear.org> wrote:
> Here are issues to think about. You wrote:
>
> 'locking' value [',' value]*':' suite
> The list of values are the objects that can be mutated in this lock. An
> immutable object showing up in the list of values is a TypeError.
>
> However:
> (1) Greg Ewing gave an example sort of like this:
>
> new_balance = balance + deposit
> lock(balance)
> balance = new_balance
> unlock(balance)
>
> and pointed out that the locks don't help
Yup. You can't keep people from writing code that's just wrong. But
they can't get it right if they don't add any locking at all.
> lock(balance)
> new_balance = balance + deposit
> balance = new_balance
> unlock(balance)
This case would still throw an exception, because what needs to be
locked isn't balance, but whatever balance is an attribute of. Unless
it's a local variable, in which case it doesn't need to be locked.
Given the code as is, balance needing to be locked would make it
global, so you'd lock the module. A more realistic implementation
would be if balance were self.balance, in which case you'd lock self.
> Consider this operating on lists:
>
> locking stuff: #1
> stuff += added_stuff
> locking stuff: #2
> stuff = stuff + added_stuff
>
> In #2, locking is completely useless. Sure the values are mutable but I'm
> not mutating them. Is it obvious that #1 works and #2 doesn't? You don't
> want to lock the *value* of stuff, you want to lock the *variable*, i.e.,
> locking globals() or wherever the value of stuff is found.
Right. Possibly "value" was the wrong choice for a placeholder in the
original description, because what you're actually locking is a
container of some sort.
> Locking globals() seems like a bad idea. Which brings me to...
> (2) How does locking something like a dictionary work? Does it lock the
> dictionary and all the values in it, walking the entire data structure to
> lock it? Ypu suggest that's the case and that seems like a performance
> nightmare.
If I suggested that, then it was unintentional. Yes, it's a
performance nightmare. Locking a container just locks for changes to
set of contained objects. It's analogous to tuples being immutable:
you can't change the set of objects in the tuple, but if those objects
are mutable, you can change them. If you want to change an object in a
list/dictionary/etc - then you have to lock that object, but not the
dictionary.
Given
>
> x = [1, 2]
> d = {3: x, 4: 5}
>
> How do I lock d[3]? That is, I want to lock the dictionary slot where the
> value of d[3] is stored so another thread can't do d[3] = 0. If I write
>
> locking d[3]: # do stuff
>
> that's going to lock the value of x. Another thread won't be able to do
> x.append(0) but they would be able to do x = 0 or d[3] = 0. If I have to
> write
>
> locking d: # do stuff
>
> that hampers concurrency.
Yes, it does. But so does any form of locking.
This proposal locks objects. Changes that don't change the object -
even if they might change it's value by changing a contained object -
aren't locked, because of the expense you pointed out. If there's some
different level of granularity for locking that makes sense, I don't
know what it is.
> If I can lock the dictionary slot d[3], can I lock
> list slots? After all, the compiler doesn't know they type of d. How do I
> lock just an attribute without locking the whole object?
Attributes are names. You don't lock names, you lock objects. To
prevent binding of an attribute, you lock the object it's an attribute
of. If you want to prevent mutating the value of the object bound to
the attribute, you lock that object.
> (3) What happens if one thread does:
>
> locking d, x: # stuff
>
> and another does
>
> locking x, d: # stuff
That one I dealt with in the specification. There's an implementation
requirement that if two locking statements share objects, the shared
objects have to be locked in the same order. So one of the two will
lock things in the opposite of the order they appear in the statement.
> I think I know the answer and it's "deadlock".
No, avoiding deadlock is one of the goals. That's why that requirement
exists, and there's a further requirement that you can only nest
locking statements if the inner one locks a subset of the outer one.
I have as yet to work out how an automatic STM version (this wasn't in
the original proposal) would interact here.
<mike
More information about the Python-ideas
mailing list