Software Transactional Memory for Python
Hi all,

About multithreading models: I recently made an observation which might be obvious to some, but not to me, and as far as I know not to most of us either. I think that it's worth being pointed out :-)

http://mail.python.org/pipermail/pypy-dev/2011-August/008153.html

A bientôt,
Armin.
On Sat, Aug 27, 2011 at 8:45 PM, Armin Rigo wrote:

About multithreading models: I recently made an observation (...)

http://mail.python.org/pipermail/pypy-dev/2011-August/008153.html
Having a context manager to say "don't release the GIL" for a bit could actually be really nice (e.g. for implementing builtin-style method semantics for data types written in Python). However, two immediate questions come to mind:

1. How does the patch interact with C code that explicitly releases the GIL? (e.g. IO commands inside a "with atomic:" block)

2. Whether or not Jython and IronPython could implement something like that, since they're free-threaded with fine-grained locks. If they can't, then I don't see how we could justify making it part of the standard library.

Interesting idea, though :)

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
Hi Nick,
On Sat, Aug 27, 2011 at 2:40 PM, Nick Coghlan wrote:
1. How does the patch interact with C code that explicitly releases the GIL? (e.g. IO commands inside a "with atomic:" block)
As implemented, any code in a "with atomic" block is prevented from explicitly releasing and reacquiring the GIL: the GIL remains acquired until the end of the "with" block. In other words, Py_BEGIN_ALLOW_THREADS has no effect in a "with" block.

This gives semantics that, in a full multi-core STM world, would be implementable by saying that if, in the middle of a transaction, you need to do I/O, then from this point onwards the transaction is not allowed to abort any more. Such "inevitable" transactions are already supported e.g. by RSTM, the C++ framework I used to prototype a C version (https://bitbucket.org/arigo/arigo/raw/default/hack/stm/c).
2. Whether or not Jython and IronPython could implement something like that, since they're free threaded with fine-grained locks. If they can't then I don't see how we could justify making it part of the standard library.
Yes, I can imagine some solutions. I am no Jython or IronPython expert, but let us assume that they have a way to check synchronously for external events from time to time (i.e. if there is some equivalent to sys.setcheckinterval()). If they do, then all you need is the right synchronization: the thread that wants to start a "with atomic" has to wait until all other threads are paused in the external check code.

(Again, like CPython's, this is not a properly multi-core STM-ish solution, but it would give the right semantics. And if it turns out that STM is successful in the future, Java will grow more direct support for it <wink>)

A bientôt,
Armin.
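The synchronization sketched above can be modelled with a condition variable: worker threads poll a checkpoint between "bytecodes", and a thread entering "with atomic" waits until every other registered thread is parked there. This is a toy with invented names; a real Jython/IronPython version would hook their existing check-interval machinery.

```python
import threading
import time

class CheckpointWorld:
    """Toy model: atomic_begin() returns only once all workers are
    parked at checkpoint(); atomic_end() releases them."""
    def __init__(self, nworkers):
        self._cond = threading.Condition()
        self._nworkers = nworkers
        self._parked = 0
        self._atomic = False

    def checkpoint(self):
        # The periodic, sys.setcheckinterval()-style external check.
        with self._cond:
            while self._atomic:
                self._parked += 1
                self._cond.notify_all()   # wake a waiting atomic_begin()
                self._cond.wait()
                self._parked -= 1

    def atomic_begin(self):
        with self._cond:
            self._atomic = True
            while self._parked < self._nworkers:
                self._cond.wait()

    def atomic_end(self):
        with self._cond:
            self._atomic = False
            self._cond.notify_all()       # release the parked workers

# Demo: one worker incrementing a counter, frozen during "atomic".
world = CheckpointWorld(nworkers=1)
counter = [0]
stop = threading.Event()

def worker():
    while not stop.is_set():
        world.checkpoint()
        counter[0] += 1

t = threading.Thread(target=worker)
t.start()
time.sleep(0.05)
world.atomic_begin()
frozen = counter[0]
time.sleep(0.1)
assert counter[0] == frozen   # the worker is parked at its checkpoint
stop.set()
world.atomic_end()
t.join()
```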
On Sat, 27 Aug 2011 15:08:36 +0200, Armin Rigo wrote:
Hi Nick,
On Sat, Aug 27, 2011 at 2:40 PM, Nick Coghlan wrote:

1. How does the patch interact with C code that explicitly releases the GIL? (e.g. IO commands inside a "with atomic:" block)
As implemented, any code in a "with atomic" block is prevented from explicitly releasing and reacquiring the GIL: the GIL remains acquired until the end of the "with" block. In other words, Py_BEGIN_ALLOW_THREADS has no effect in a "with" block.
You then risk deadlocks. Say:

- thread A is inside a "with atomic" and calls a library function which tries to take lock L
- thread B has already taken lock L and is currently executing an I/O function with the GIL released
- thread B then waits for the GIL (and hence depends on thread A going forward), while thread A waits for lock L (and hence depends on thread B going forward)

Lock L could simply be the lock used by the file object (a Buffered{Reader,Writer,Random}) which thread B is reading or writing from.

Regards,
Antoine.
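The ordering problem can be reproduced with two plain locks standing in for the GIL and lock L. Timeouts are used so the sketch terminates instead of actually hanging, and the "gil" here is just an ordinary lock, not CPython's; the events only force the problematic interleaving deterministically.

```python
import threading

gil = threading.Lock()      # stands in for the GIL
lock_L = threading.Lock()   # stands in for e.g. a BufferedWriter's lock

a_has_gil = threading.Event()
b_has_L = threading.Event()
a_done = threading.Event()
results = {}

def thread_a():
    # Thread A: inside "with atomic", so it holds the GIL throughout,
    # then calls a library function that needs lock L.
    with gil:
        a_has_gil.set()
        b_has_L.wait()
        results["A"] = lock_L.acquire(timeout=0.6)   # blocks: B holds L
        if results["A"]:
            lock_L.release()
        a_done.set()

def thread_b():
    # Thread B: holds lock L while doing I/O with the GIL released,
    # then tries to re-take the GIL.
    with lock_L:
        b_has_L.set()
        a_has_gil.wait()
        results["B"] = gil.acquire(timeout=0.2)      # blocks: A holds the GIL
        if results["B"]:
            gil.release()
        a_done.wait()   # keep holding L until A has given up too

ta = threading.Thread(target=thread_a)
tb = threading.Thread(target=thread_b)
ta.start(); tb.start()
ta.join(); tb.join()
print(sorted(results.items()))   # [('A', False), ('B', False)]
```

Each side times out waiting on a resource the other holds; with real (untimed) waits this is exactly the deadlock described above.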
Hi Antoine,
You then risk deadlocks. Say: (...)
Yes, it is indeed not a solution that co-operates transparently and deadlock-freely with regular locks. You risk the same kind of deadlocks as you would when using only locks. The reason is similar to threads that try to acquire two locks in succession. In your example:
- thread A is inside a "with atomic" and calls a library function which tries to take lock L
This is basically dangerous, because it corresponds to taking lock "GIL" and lock L, in that order, whereas thread B takes lock L and plays around with lock "GIL" in the opposite order. I think a reasonable solution to avoid deadlocks is simply not to use explicit locks inside "with atomic" blocks.

Generally speaking, it can be regarded as wrong to do any action that causes an unbounded wait in a "with atomic" block, but the solution I chose to implement in my patch is to still allow them, because it doesn't make much sense to say that "print" or "pdb.set_trace()" are forbidden.

A bientôt,
Armin.
Hi Armin,
This is basically dangerous, because it corresponds to taking lock "GIL" and lock L, in that order, whereas thread B takes lock L and plays around with lock "GIL" in the opposite order. I think a reasonable solution to avoid deadlocks is simply not to use explicit locks inside "with atomic" blocks.
The problem is that many locks are actually acquired implicitly. For example, `print` to a buffered stream will acquire the fileobject's mutex.

Also, even if the code inside the "with atomic" block doesn't directly or indirectly acquire a lock, there's still the possibility of asynchronous code that acquires locks being executed in the middle of this block: for example, signal handlers are run on behalf of the main thread from the main eval loop and in certain other places, and the GC might kick in at any time.
Generally speaking it can be regarded as wrong to do any action that causes an unbounded wait in a "with atomic" block,
Indeed. cf
Hi Charles-François,
2011/8/27 Charles-François Natali wrote:
The problem is that many locks are actually acquired implicitly. For example, `print` to a buffered stream will acquire the fileobject's mutex.
Indeed. After looking more at the kind of locks used throughout the stdlib, I notice that in many cases a lock is acquired by code in the following simple pattern:

    Py_BEGIN_ALLOW_THREADS
    PyThread_acquire_lock(self->lock, 1);
    Py_END_ALLOW_THREADS

If one thread is waiting in the END_ALLOW_THREADS for another one to release the GIL, but the other one is in a "with atomic" block and tries to acquire the same lock: deadlock. But the issue can be resolved: the first thread in the above example needs to notice that the other thread is in a "with atomic" block, and "be nice" and release the lock again. Then it waits until the "with atomic" block finishes, and tries again from the start.

We could do this by putting the above pattern in its own function (which makes some sense anyway, because the pattern is repeated left and right, and is often complicated by an additional "if (!PyThread_acquire_lock(self->lock, 0))" before); and then allowing that function to be overridden by the external 'stm' module.

I suspect that I need to do a more thorough review of the stdlib to make sure (at least more than now) that all potential deadlocking places can be avoided with a similar refactoring. All in all, it seems that the patch to CPython itself will need to be more than just the few lines in ceval.c --- but still very reasonable both in size and in content.

A bientôt,
Armin.
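Modelled in Python rather than C (all names here are invented; the real change would live in the C pattern above), the "be nice and retry" idea looks roughly like this:

```python
import threading
import time

class NiceLockWorld:
    """Toy model of the refactored acquisition pattern: a thread that
    wants `lock` backs off and retries while any thread is inside a
    "with atomic" block, instead of deadlocking against it."""
    def __init__(self):
        self._cond = threading.Condition()
        self._atomic_depth = 0

    def atomic_begin(self):
        with self._cond:
            self._atomic_depth += 1

    def atomic_end(self):
        with self._cond:
            self._atomic_depth -= 1
            self._cond.notify_all()

    def acquire_nicely(self, lock):
        # Like the Py_BEGIN/END_ALLOW_THREADS pattern, but "be nice":
        # if a "with atomic" block started meanwhile, give the lock
        # back, wait for the block to finish, and try again.
        while True:
            lock.acquire()
            with self._cond:
                if self._atomic_depth == 0:
                    return                      # keep the lock
                lock.release()
                while self._atomic_depth:
                    self._cond.wait()

# Demo: a thread politely yields the lock while an atomic block runs.
world = NiceLockWorld()
lk = threading.Lock()

world.atomic_begin()
t = threading.Thread(target=world.acquire_nicely, args=(lk,))
t.start()
time.sleep(0.1)
assert not lk.locked()        # the thread gave the lock back and waits
world.atomic_end()
t.join()
assert lk.locked()            # the retry succeeded once the block ended
```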
Re-hi,
2011/8/29 Armin Rigo wrote:
The problem is that many locks are actually acquired implicitly. For example, `print` to a buffered stream will acquire the fileobject's mutex.
Indeed. (...) I suspect that I need to do a more thorough review of the stdlib (...)
I found a solution not involving any change in CPython, and updated the patch. The solution is to say that a "with atomic" block doesn't completely prevent other threads from re-acquiring the GIL, but only prevents them from proceeding to the following bytecode. So if another thread is currently suspended in a place that releases the GIL for other reasons, then this other thread can still be switched to as normal, and continue running until the end of the current bytecode. I think it's sane enough for the original purpose, and avoids most deadlock cases. A bientôt, Armin.
Maybe it'd be better to put 'atomic' in the threading module?

On 2011-08-30, at 4:02 PM, Armin Rigo wrote:
I found a solution not involving any change in CPython, and updated the patch. (...)

A bientôt,
Armin.
Hi,
On Tue, Aug 30, 2011 at 11:33 PM, Yury Selivanov wrote:
Maybe it'd be better to put 'atomic' in the threading module?
'threading' is pure Python. But anyway the consensus is to not have 'atomic' at all in the stdlib, which means it is in its own 3rd-party extension module. Armin
On Sat, Aug 27, 2011 at 6:08 AM, Armin Rigo wrote:

As implemented, any code in a "with atomic" block is prevented from explicitly releasing and reacquiring the GIL: the GIL remains acquired until the end of the "with" block. (...)
This sounds like a very interesting idea to pursue, even if it's late, and even if it's experimental, and even if it's possible to cause deadlocks (no news there). I propose that we offer a C API in Python 3.3 as well as an extension module that offers the proposed decorator. The C API could then be used to implement alternative APIs purely as extension modules (e.g. would a deadlock-detecting API be possible?).

I don't think this needs a PEP; it's not a very pervasive change. We can even document the API as experimental. But (if I may trust Armin's reasoning) it's important to add support directly to CPython, as currently it cannot be done as a pure extension module.

--
--Guido van Rossum (python.org/~guido)
Hi Guido,
On Sun, Aug 28, 2011 at 6:43 PM, Guido van Rossum wrote:
This sounds like a very interesting idea to pursue, even if it's late, and even if it's experimental, and even if it's possible to cause deadlocks (no news there). I propose that we offer a C API in Python 3.3 as well as an extension module that offers the proposed decorator.
Very good idea: http://bugs.python.org/issue12850

The extension module, called 'stm' for now, is designed as an independent 3rd-party extension module. It should at this point not be included in the stdlib; for one thing, it needs some more testing than my quick one-page hacks, and we need to seriously look at the deadlock issues mentioned here. But the patch to ceval.c above looks rather straightforward to me and could, if no subtle issue is found, be included in the standard CPython.

A bientôt,
Armin.
On Sun, 28 Aug 2011 09:43:33 -0700, Guido van Rossum wrote:
This sounds like a very interesting idea to pursue (...) The C API could then be used to implement alternative APIs purely as extension modules (e.g. would a deadlock-detecting API be possible?).
We could offer the C API without shipping an extension module ourselves. I don't think we should provide (and maintain!) a Python API that helps users put themselves in all kinds of nasty situations. There is enough misunderstanding around the GIL and multithreading already.

Regards,
Antoine.
On Mon, Aug 29, 2011 at 5:20 AM, Antoine Pitrou wrote:

We could offer the C API without shipping an extension module ourselves. (...)
+1
participants (7):

- Antoine Pitrou
- Armin Rigo
- Charles-François Natali
- Gregory P. Smith
- Guido van Rossum
- Nick Coghlan
- Yury Selivanov