
I've successfully embedded Python for a single thread.
I tried to extend the implementation for multiple threads (a worker thread scenario) and I'm encountering either deadlocks or seg faults depending upon how I go about it.
There seems to be some inconsistency between what is covered in the docs here:
http://docs.python.org/2/c-api/init.html#non-python-created-threads
and the experiences of other users trying the same thing, e.g.
http://bugs.python.org/issue19576
http://wiki.blender.org/index.php/Dev:2.4/Source/Python/API/Threads
Can anybody comment on the situation, in particular:
- Is the non-python-created-threads documentation accurate for v2.7?
- If a main thread does things like importing a module and obtaining a reference to a Python method, can those things be used by other C++ threads or do they have to repeat those lookups?
- Is there any logic that needs to be executed once only as each thread is started? (the doc suggests just PyGILState_Ensure/PyGILState_Release each time a thread accesses Python methods - is there anything else?)
- Given the bug 19576, what is the most portable way to code this to work reliably on unfixed Python versions? (e.g. should users always explicitly call PyEval_InitThreads() in their main thread or worker threads or both?)
Here is my actual source code:
https://svn.resiprocate.org/viewsvn/resiprocate/main/repro/plugins/pyroute/
(see example.py for a trivial example of what it does)
The problem that I encounter:
- the init stuff runs fine in PyRoutePlugin.cxx: it calls Py_Initialize, PyEval_InitThreads, PyImport_ImportModule and looks up the "provide_route" method in the module
- it creates a PyRouteWorker object, giving it a reference to "provide_route"
- it creates a thread pool to run the worker
- the PyRouteWorker::process() method is invoked in one of those threads
- it crashes when trying to call the "provide_route" method
PyRouteWorker.cxx:
    routes = mAction.apply(args);
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff30b8700 (LWP 23965)]
0x00007ffff3d6ad06 in PyObject_Call () from /usr/lib/libpython2.7.so.1.0
(gdb) bt
#0 0x00007ffff3d6ad06 in PyObject_Call () from /usr/lib/libpython2.7.so.1.0
#1 0x00007ffff3d6b647 in PyEval_CallObjectWithKeywords () from /usr/lib/libpython2.7.so.1.0
#2 0x00007ffff414885a in apply (args=..., this=<optimized out>) at /usr/include/python2.7/CXX/Python2/Objects.hxx:3215
#3 repro::PyRouteWorker::process (this=0x6f00a0, msg=<optimized out>) at PyRouteWorker.cxx:98
#4 0x00007ffff7b879e1 in repro::WorkerThread::thread (this=0x68e110) at WorkerThread.cxx:36
#5 0x00007ffff70b7a2f in threadIfThreadWrapper (threadParm=<optimized out>) at ThreadIf.cxx:51
#6 0x00007ffff65ffb50 in start_thread (arg=<optimized out>) at pthread_create.c:304
#7 0x00007ffff5999a7d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#8 0x0000000000000000 in ?? ()

Another link that fills in some gaps and finally helped me make this work:
http://www.codevate.com/blog/7-concurrency-with-embedded-python-in-a-multi-t...
In particular, I found that PyGILState_Ensure/PyGILState_Release as described in the Python docs is not sufficient - as described in that blog link, I had to
a) obtain PyInterpreterState from the first thread where Py_Initialize() was called
b) when each worker thread starts, call PyThreadState_New(mInterpreterState) and save the result in a thread local mPyThreadState
c) use the mPyThreadState with PyEval_RestoreThread and PyEval_SaveThread before and after calling Python methods
Is this a bug in PyGILState_Ensure or is it a deficiency in the documentation?
I also found one bug in my own code, although that was not related to the problem just described with PyGILState_Ensure and I had to fix both problems to make it work. Specifically, the PyWorkerThread constructor was taking an object argument when it should have taken a reference argument and this was creating an invalid Py::Callable member in my worker.
On 18/12/13 00:19, Daniel Pocock wrote:
I've successfully embedded Python for a single thread.
I tried to extend the implementation for multiple threads (a worker thread scenario) and I'm encountering either deadlocks or seg faults depending upon how I go about it.
There seems to be some inconsistency between what is covered in the docs here:
http://docs.python.org/2/c-api/init.html#non-python-created-threads
and the experiences of other users trying the same thing, e.g.
http://bugs.python.org/issue19576 http://wiki.blender.org/index.php/Dev:2.4/Source/Python/API/Threads
Can anybody comment on the situation, in particular,
Is the non-python-created-threads documentation accurate for v2.7?
If a main thread does things like importing a module and obtaining a reference to a Python method, can those things be used by other C++ threads or do they have to repeat those lookups?
Is there any logic that needs to be executed once only as each thread is started? (the doc suggests just PyGILState_Ensure/PyGILState_Release each time a thread accesses Python methods - is there anything else?)
Given the bug 19576, what is the most portable way to code this to work reliably on unfixed Python versions? (e.g. should users always explicitly call PyEval_InitThreads() in their main thread or worker threads or both?)
Here is my actual source code:
https://svn.resiprocate.org/viewsvn/resiprocate/main/repro/plugins/pyroute/
(see example.py for a trivial example of what it does)
The problem that I encounter:
- the init stuff runs fine in PyRoutePlugin.cxx: it calls Py_Initialize, PyEval_InitThreads, PyImport_ImportModule and looks up the "provide_route" method in the module
- it creates a PyRouteWorker object, giving it a reference to "provide_route"
- it creates a thread pool to run the worker
- the PyRouteWorker::process() method is invoked in one of those threads
- it crashes when trying to call the "provide_route" method
PyRouteWorker.cxx:
    routes = mAction.apply(args);
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff30b8700 (LWP 23965)]
0x00007ffff3d6ad06 in PyObject_Call () from /usr/lib/libpython2.7.so.1.0
(gdb) bt
#0 0x00007ffff3d6ad06 in PyObject_Call () from /usr/lib/libpython2.7.so.1.0
#1 0x00007ffff3d6b647 in PyEval_CallObjectWithKeywords () from /usr/lib/libpython2.7.so.1.0
#2 0x00007ffff414885a in apply (args=..., this=<optimized out>) at /usr/include/python2.7/CXX/Python2/Objects.hxx:3215
#3 repro::PyRouteWorker::process (this=0x6f00a0, msg=<optimized out>) at PyRouteWorker.cxx:98
#4 0x00007ffff7b879e1 in repro::WorkerThread::thread (this=0x68e110) at WorkerThread.cxx:36
#5 0x00007ffff70b7a2f in threadIfThreadWrapper (threadParm=<optimized out>) at ThreadIf.cxx:51
#6 0x00007ffff65ffb50 in start_thread (arg=<optimized out>) at pthread_create.c:304
#7 0x00007ffff5999a7d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#8 0x0000000000000000 in ?? ()
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/daniel%40pocock.com.au

On Wed, Dec 18, 2013 at 9:26 PM, Daniel Pocock <daniel@pocock.com.au> wrote:
b) when each worker thread starts, call PyThreadState_New(mInterpreterState) and save the result in a thread local mPyThreadState
c) use the mPyThreadState with PyEval_RestoreThread and PyEval_SaveThread before and after calling Python methods
Is this a bug in PyGILState_Ensure or is it a deficiency in the documentation?
It doesn't surprise me that you would need to do step b - I do seem to recall the need to call that for each new thread. Not so sure about c. Once you fixed the unrelated bug, do you still need that step? (Been a while since I last embedded Python though, and I might well be wrong.)
ChrisA

On 18/12/13 16:02, Chris Angelico wrote:
On Wed, Dec 18, 2013 at 9:26 PM, Daniel Pocock <daniel@pocock.com.au> wrote:
b) when each worker thread starts, call PyThreadState_New(mInterpreterState) and save the result in a thread local mPyThreadState
c) use the mPyThreadState with PyEval_RestoreThread and PyEval_SaveThread before and after calling Python methods
Is this a bug in PyGILState_Ensure or is it a deficiency in the documentation?
It doesn't surprise me that you would need to do step b - I do seem to recall the need to call that for each new thread. Not so sure about c. Once you fixed the unrelated bug, do you still need that step? (Been a while since I last embedded Python though, and I might well be wrong.)
Yes, I definitely needed to use this PyThreadState_New call even after my unrelated bug fix.
Should it be added to the documentation?
I created a C++ wrapper around this logic, it is here:
https://github.com/resiprocate/resiprocate/blob/master/repro/plugins/pyroute...
and the use case is something like:
    // in constructor:
    PyExternalUser* mPyUser = new PyExternalUser(mInterpreterState);
and each time Python calls are made, just do:
    {
        PyExternalUser::Use use(*mPyUser);
        // now call Python code
    }
When the PyExternalUser::Use instance is created it calls PyEval_RestoreThread(). When the PyExternalUser::Use instance goes out of scope it is destroyed and PyEval_SaveThread() is called.

On Wed, 18 Dec 2013 00:19:23 +0100 Daniel Pocock <daniel@pocock.com.au> wrote:
If a main thread does things like importing a module and obtaining a reference to a Python method, can those things be used by other C++ threads or do they have to repeat those lookups?
The C++ threads must use the PyGILState API to initialize corresponding Python thread states, and to hold the GIL. However, you don't have to *repeat* the lookups - pointers valid in one thread are valid in other threads, as long as you own a strong reference to the PyObject (beware functions which return a borrowed reference).
Is there any logic that needs to be executed once only as each thread is started? (the doc suggests just PyGILState_Ensure/PyGILState_Release each time a thread accesses Python methods - is there anything else?)
If you use the PyGILState API, there shouldn't be anything else.
(e.g. should users always explicitly call PyEval_InitThreads() in their main thread or worker threads or both?)
You only need to call PyEval_InitThreads() once in the main Python thread.
Regards
Antoine.

2013/12/18 Antoine Pitrou <solipsis@pitrou.net>:
You only need to call PyEval_InitThreads() once in the main Python thread.
This is not well documented. For your information, PyGILState_Ensure() now calls PyEval_InitThreads() in Python 3.4, see:
http://bugs.python.org/issue19576
Victor

On 18/12/13 16:29, Victor Stinner wrote:
2013/12/18 Antoine Pitrou <solipsis@pitrou.net>:
You only need to call PyEval_InitThreads() once in the main Python thread.
This is not well documented. For your information, PyGILState_Ensure() now calls PyEval_InitThreads() in Python 3.4, see: http://bugs.python.org/issue19576
I did see that - but from my own experience, I do not believe it is calling PyThreadState_New(..) and it is not even checking if PyThreadState_New(..) has ever been called for the active thread.
Consequently, the thread is blocked or there is a seg fault.
I've now written up a much more thorough overview of my experience on my blog:
http://danielpocock.com/embedding-python-multi-threaded-cpp

On 19 December 2013 07:58, Daniel Pocock <daniel@pocock.com.au> wrote:
On 18/12/13 16:29, Victor Stinner wrote:
2013/12/18 Antoine Pitrou <solipsis@pitrou.net>:
You only need to call PyEval_InitThreads() once in the main Python thread.
This is not well documented. For your information, PyGILState_Ensure() now calls PyEval_InitThreads() in Python 3.4, see: http://bugs.python.org/issue19576
I did see that - but from my own experience, I do not believe it is calling PyThreadState_New(..) and it is not even checking if PyThreadState_New(..) has ever been called for the active thread
Consequently, the thread is blocked or there is a seg fault
I've now written up a much more thorough overview of my experience on my blog:
You absolutely should *NOT* have to call PyThreadState_New before calling PyGILState_Ensure, as it is designed to call it for you (see http://hg.python.org/cpython/file/2.7/Python/pystate.c#l598). If calling PyThreadState_New works, but calling PyGILState_Ensure does not, then something else is broken (such as not initialising the thread local storage for the GIL state APIs).
I don't see anything in your article about how you ensure that the main thread of the application *before anything else related to the embedded Python happens* calls both Py_Initialize() and PyEval_InitThreads(). If you don't do that, then all bets are off in terms of multithreading support.
Regards,
Nick.
--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 19/12/13 12:22, Nick Coghlan wrote:
On 19 December 2013 07:58, Daniel Pocock <daniel@pocock.com.au> wrote:
On 18/12/13 16:29, Victor Stinner wrote:
2013/12/18 Antoine Pitrou <solipsis@pitrou.net>:
You only need to call PyEval_InitThreads() once in the main Python thread.
This is not well documented. For your information, PyGILState_Ensure() now calls PyEval_InitThreads() in Python 3.4, see: http://bugs.python.org/issue19576
I did see that - but from my own experience, I do not believe it is calling PyThreadState_New(..) and it is not even checking if PyThreadState_New(..) has ever been called for the active thread
Consequently, the thread is blocked or there is a seg fault
I've now written up a much more thorough overview of my experience on my blog:
You absolutely should *NOT* have to call PyThreadState_New before calling PyGILState_Ensure, as it is designed to call it for you (see http://hg.python.org/cpython/file/2.7/Python/pystate.c#l598). If calling PyThreadState_New works, but calling PyGILState_Ensure does not, then something else is broken (such as not initialising the thread local storage for the GIL state APIs).
I don't see anything in your article about how you ensure that the main thread of the application *before anything else related to the embedded Python happens* calls both Py_Initialize() and PyEval_InitThreads(). If you don't do that, then all bets are off in terms of multithreading support.
I definitely do both of those things in the method PyRoutePlugin::init(..)
It is in PyRoutePlugin.cxx:
http://svn.resiprocate.org/viewsvn/resiprocate/main/repro/plugins/pyroute/Py...

On 19 December 2013 21:28, Daniel Pocock <daniel@pocock.com.au> wrote:
On 19/12/13 12:22, Nick Coghlan wrote:
I don't see anything in your article about how you ensure that the main thread of the application *before anything else related to the embedded Python happens* calls both Py_Initialize() and PyEval_InitThreads(). If you don't do that, then all bets are off in terms of multithreading support.
I definitely do both of those things in the method PyRoutePlugin::init(..)
It is in PyRoutePlugin.cxx:
http://svn.resiprocate.org/viewsvn/resiprocate/main/repro/plugins/pyroute/Py...
I can't see an immediately obvious explanation for why your current approach based on PyExternalUser::Use gets things working, while the PyThreadSupport approach fails immediately. However, you'd need to be able to reproduce the problem with a much simpler embedding application and without PyCXX involved anywhere to confirm it as a possible CPython bug, or to identify exactly what is missing from the current embedding initialisation instructions.
The reason for that is the fact that the GIL state API is unit tested on a wide variety of platforms inside a fully initialised interpreter and that means we know this *does* work in the absence of external interference:
http://hg.python.org/cpython/file/16bfddf5a091/Modules/_testcapimodule.c#l13...
I also asked Graham Dumpleton (author of mod_wsgi, one of the more complex CPython embedding scenarios currently in existence) to take a look, and he didn't see any obvious explanation for the discrepancy either, so you may want to try a cut down implementation without PyCXX to see if that's the culprit.
Regards,
Nick.
--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
participants (5)
- Antoine Pitrou
- Chris Angelico
- Daniel Pocock
- Nick Coghlan
- Victor Stinner