Sub-interpreters: importing numpy causes hang
Hi all!

I am new to the list and arriving with a concrete problem that I'd like to fix myself.

I am embedding Python (3.6) into my C++ application and I would like to run Python scripts isolated from each other using sub-interpreters. I am not using threads; everything is supposed to run in the application's main thread.

I noticed that if I create an interpreter, switch to it and execute code that imports numpy (1.13), my application will hang.

python36.dll!_PyCOND_WAIT_MS(_PyCOND_T * cv=0x00000000748a67a0, _RTL_CRITICAL_SECTION * cs=0x00000000748a6778, unsigned long ms=5) Line 245 C
[Inline Frame] python36.dll!PyCOND_TIMEDWAIT(_PyCOND_T *) Line 275 C
ntdll.dll!NtWaitForSingleObject() Unknown
KernelBase.dll!WaitForSingleObjectEx() Unknown
python36.dll!take_gil(_ts * tstate=0x0000023251cbc260) Line 224 C
python36.dll!PyEval_RestoreThread(_ts * tstate=0x0000023251cbc260) Line 370 C
python36.dll!PyGILState_Ensure() Line 855 C
umath.cp36-win_amd64.pyd!00007ff8c6306ab2() Unknown
umath.cp36-win_amd64.pyd!00007ff8c630723c() Unknown
umath.cp36-win_amd64.pyd!00007ff8c6303a1d() Unknown
umath.cp36-win_amd64.pyd!00007ff8c63077c0() Unknown
umath.cp36-win_amd64.pyd!00007ff8c62ff926() Unknown
[Inline Frame] python36.dll!_PyObject_FastCallDict(_object *) Line 2316 C
[Inline Frame] python36.dll!_PyObject_FastCallKeywords(_object *) Line 2480 C
python36.dll!call_function(_object * * * pp_stack=0x00000048be5f5e40, __int64 oparg, _object * kwnames) Line 4822 C

Numpy's extension umath calls PyGILState_Ensure(), which in turn calls PyEval_RestoreThread on the (auto) threadstate of the main interpreter. And that's wrong. We are already holding the GIL with the threadstate of our current sub-interpreter, so there's no need to switch.

I know that the GIL API is not fully compatible with sub-interpreters, as issues #10915 and #15751 illustrate.

But since I need to support calls to PyGILState_Ensure - numpy is the best example -, I am trying to improve the situation here: https://github.com/stephanreiter/cpython/commit/d9d3451b038af2820f500843b6a8...

That change may be naive, but it does the trick for my use case. If totally wrong, I don't mind pursuing another alley.

Essentially, I'd like to ask for some guidance in how to tackle this problem while keeping the current GIL API unchanged (to avoid breaking modules).

I am also wondering how I can test any changes I am proposing. Is there a test suite for interpreters, for example?

Thank you very much,
Stephan
On Tue, 22 Jan 2019 15:32:22 +0100
Stephan Reiter wrote:
> Numpy's extension umath calls PyGILState_Ensure(), which in turn calls PyEval_RestoreThread on the (auto) threadstate of the main interpreter. And that's wrong. We are already holding the GIL with the threadstate of our current sub-interpreter, so there's no need to switch.
> I know that the GIL API is not fully compatible with sub-interpreters, as issues #10915 and #15751 illustrate.

That's a pity. Note that there is a patch on https://bugs.python.org/issue10915 that could probably solve the issue if it had been applied some years ago ;-) (yes, it needs C extension authors to use the new API, but Numpy is a well-maintained library and would probably have accepted a patch for that; so would Cython probably)

> Essentially, I'd like to ask for some guidance in how to tackle this problem while keeping the current GIL API unchanged (to avoid breaking modules).

I'm not aware of any solution which does not require designing a new API, unfortunately.

> I am also wondering how I can test any changes I am proposing. Is there a test suite for interpreters, for example?

You'll find a couple of them in test_embed.py, test_capi.py and test_threading.py.

Regards

Antoine.
There are currently numerous incompatibilities between numpy and
subinterpreters, and no concrete plan for fixing them. The numpy team does
not consider subinterpreters to be a supported configuration, and can't
help you with any issues you run into. I know the concept of
subinterpreters is really appealing, but unfortunately the CPython
implementation is not really mature or widely supported... are you
absolutely certain you need to use subinterpreters for your application?
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe:
https://mail.python.org/mailman/options/python-dev/njs%40pobox.com
Thanks for the answers so far. I appreciate them!
Nathaniel, I'd like to allow Python plugins in my application. A
plugin should be allowed to bring its own modules along (i.e.
plugin-specific subdir is in sys.path when the plugin is active) and
hence some isolation of them will be needed, so that they can use
different versions of a given module. That's my main motivation for
using subinterpreters.
I thought about running plugins out of process (a separate process
for every plugin) and letting them communicate with my application
via RPC. But that makes the API my application will offer more complex
to implement, and it will slow things down because data has to be
copied.
Maybe you have another idea for me? :)
Henry, Antoine, thanks for your input; I'll check out the tests and
see what I can learn from issue 10915.
Stephan
On Tue, Jan 22, 2019 at 6:33 PM Stephan Reiter wrote:
> Maybe you have another idea for me? :)

Not really, sorry! I believe that most applications that support Python plugins (like blender, gimp, libreoffice, etc.), do it by using a single shared environment for all plugins. This is also how every application written in Python works, so at the ecosystem level there's a lot of pressure on module authors to make it possible to assemble them into a single coherent environment.

-n

--
Nathaniel J. Smith -- https://vorpus.org
On 1/23/19 3:33 AM, Stephan Reiter wrote:
> Maybe you have another idea for me? :)

Try to make the plugins work together. Look into using pip/PyPI for your plugins. Try to make it so each package ("plugin") would have only one module/package, and dependencies would be other packages that can be installed individually and shared. And keep in mind you can set up your own package index, or distribute/install individual package files.

If that's not possible, and you want things to work now, go with subprocess.

If you want to help make subinterpreters work better, there are several people scratching at the problem from different angles. Most/all would welcome help, but don't expect any short-term benefits. (FWIW, my own effort is currently blocked on PEP 580, and I hope to move forward after a Council is elected.)
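
[Editor's sketch] To make the "go with subprocess" option concrete, here is a minimal sketch of one possible shape, not something proposed in the thread: the host starts a fresh interpreter per call and exchanges one JSON request and one JSON reply over stdin/stdout. The names `PLUGIN_SOURCE` and `call_plugin` and the JSON protocol are assumptions made for illustration only.

```python
import json
import subprocess
import sys

# Hypothetical plugin: reads one JSON request from stdin and writes one
# JSON reply to stdout. A real plugin would live in its own file.
PLUGIN_SOURCE = r"""
import json, sys
request = json.load(sys.stdin)
json.dump({"result": sum(request["values"])}, sys.stdout)
"""


def call_plugin(plugin_source, request, timeout=30):
    """Run the plugin in a separate interpreter process and return its reply."""
    completed = subprocess.run(
        [sys.executable, "-c", plugin_source],
        input=json.dumps(request),   # request is copied into the child
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,                  # raise if the plugin process fails
    )
    return json.loads(completed.stdout)


print(call_plugin(PLUGIN_SOURCE, {"values": [1, 2, 3]}))
```

Each plugin gets its own `sys.modules` and its own GIL for free, and a crash in the plugin cannot take down the host. The cost is the one Stephan names: every argument and result is serialized and copied.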
Hi!
Well, the plugins would be created by third parties and I'd like to
enable them to bundle modules with their plugins.
I am afraid of modules with the same name, but being different, or
different versions of modules being used by different plugins. If
plugins share an interpreter, the module with a given name that is
imported first sticks around forever and for all plugins.
I am thinking about this design:
- Plugins don't maintain state in their Python world. They expose
functions, my application calls them.
- Every time I call into them, they are presented with a clean global
namespace. After the call, the namespace (dict) is thrown away. That
releases any objects the plugin code has created.
- So, then I could also actively unload modules they loaded. But I do
know that this is problematic in particular for modules that use
native code.
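
[Editor's sketch] The three design points above could look roughly like this in pure Python. This is only an illustration under assumptions: `call_plugin_function` and `PLUGIN_SOURCE` are hypothetical names, the plugin is given as source text, and "unloading" here only removes entries from `sys.modules`. As noted above, that does not truly unload modules that use native code.

```python
import sys

# Hypothetical plugin source; colorsys stands in for a plugin dependency.
PLUGIN_SOURCE = """
import colorsys

def hue(r, g, b):
    return colorsys.rgb_to_hsv(r, g, b)[0]
"""


def call_plugin_function(source, func_name, *args):
    """Run plugin code in a throwaway namespace and call one of its functions."""
    modules_before = set(sys.modules)
    namespace = {"__name__": "plugin"}   # clean global namespace for this call
    try:
        exec(compile(source, "<plugin>", "exec"), namespace)
        return namespace[func_name](*args)
    finally:
        # Throw the namespace away, releasing objects the plugin created.
        namespace.clear()
        # Drop modules that were first imported during this call.
        for name in set(sys.modules) - modules_before:
            del sys.modules[name]


print(call_plugin_function(PLUGIN_SOURCE, "hue", 1.0, 0.0, 0.0))
```

Modules that were already imported before the call stay in place, so two plugins that need different versions of a module the host itself uses would still conflict.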
I am interested in both a short-term and a long-term solution.
Actually, making subinterpreters work better is pretty sexy ...
because it's hard. :-)
Stephan
Hi Stephan,
On Tue, Jan 22, 2019 at 9:25 AM Stephan Reiter wrote:
> I am new to the list and arriving with a concrete problem that I'd like to fix myself.

That is great! Statements like that are a good way to get folks interested in your success. :)

> I am embedding Python (3.6) into my C++ application and I would like to run Python scripts isolated from each other using sub-interpreters. I am not using threads; everything is supposed to run in the application's main thread.

FYI, running multiple interpreters in the same (e.g. main) thread
isn't as well thought out as running them in separate threads. There
may be assumptions in the runtime that would cause crashes or
inconsistency in the runtime, so be vigilant. Is there a reason not
to run the subinterpreters in separate threads?
Regarding isolation, keep in mind that there are some limitations. At
an intrinsic level subinterpreters are never truly isolated since they
run in the same process. This matters if you have concerns about
security (which you should always consider) and stability (if a
subinterpreter crashes then your whole process crashes). You can get
complete isolation via subprocess & multiprocessing.
On top of intrinsic isolation, currently subinterpreters have gaps in
isolation that need fixing. For instance, they share a lot of
module-global state, as well as builtin types and singletons. So data
can leak between subinterpreters unexpectedly.
Finally, at the Python level subinterpreters don't have a good way to
pass data around. (I'm working on that. [1]) Naturally at the C
level you can keep pointers to objects and share data that way. Just
keep in mind that doing so relies on the GIL (in an
interpreter-per-thread scenario, which you're avoiding). In a world
where subinterpreters don't share the GIL [2] (and you're running one
interpreter per thread) you'll end up with refcounting races, leading
to crashes. Just keep that in mind if you decide to switch to
one-subinterpreter-per-thread.

On Tue, Jan 22, 2019 at 8:09 PM Stephan Reiter wrote:
> Nathaniel, I'd like to allow Python plugins in my application. A plugin should be allowed to bring its own modules along (i.e. plugin-specific subdir is in sys.path when the plugin is active) and hence some isolation of them will be needed, so that they can use different versions of a given module. That's my main motivation for using subinterpreters.

That's an interesting approach. Using subinterpreters would indeed give you isolation between the sets of imported modules. As you noticed, you'll run into some problems when extension modules are involved. There aren't any great workarounds yet. Subinterpreters are tied pretty tightly to the core runtime so it's hard to attack the problem from the outside. Furthermore, subinterpreters aren't widely used yet so folks haven't been very motivated to fix the runtime. (FWIW, that is changing.)

> I thought about running plugins out-of-processes - a separate process for every plugin - and allow them to communicate with my application via RPC. But that makes it more complex to implement the API my application will offer and will slow down things due to the need to copy data.

Yep. It might be worth it though. Note that running plugins/extensions in separate processes is a fairly common approach for a variety of solid technical reasons (e.g. security, stability). FWIW, there are some tools available (or soon to be) for sharing data more efficiently (e.g. shared memory in multiprocessing, PEP 574).

> Maybe you have another idea for me? :)

* single proc -- keep using subinterpreters
+ dlmopen or the Windows equivalent (I hesitate to suggest this
hack, but it might help somewhat with extension modules)
+ help fix the problems with subinterpreters :)
* single proc -- no subinterpreters
+ import hook to put plugins in their own namespace (tricky with
extension modules)
+ extend importlib to do the same
+ swap sys.modules in and out around plugin use
* multi-proc -- one process per plugin
+ subprocess
+ multiprocessing
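
[Editor's sketch] Of the options listed above, "swap sys.modules in and out around plugin use" is the easiest to show in a few lines. The following is a rough sketch under assumptions, not Eric's implementation: `plugin_modules`, `load_version` and the module name `bundled_dep` are hypothetical, each plugin keeps its own dict of modules, and only pure-Python modules are considered (extension modules remain as tricky as described above).

```python
import contextlib
import sys


@contextlib.contextmanager
def plugin_modules(plugin_cache, plugin_dir):
    """Temporarily give one plugin its own sys.path entry and imported modules."""
    saved_modules = dict(sys.modules)
    saved_path = list(sys.path)
    sys.modules.update(plugin_cache)   # bring back this plugin's modules
    sys.path.insert(0, plugin_dir)     # plugin-specific directory wins
    try:
        yield
    finally:
        # Remember every module the plugin added or replaced.
        for name, module in sys.modules.items():
            if saved_modules.get(name) is not module:
                plugin_cache[name] = module
        # Restore the host's view in place; the import system keeps a
        # reference to this dict, so it must not be rebound.
        for name in list(sys.modules):
            if name not in saved_modules:
                del sys.modules[name]
        sys.modules.update(saved_modules)
        sys.path[:] = saved_path


def load_version(plugin_cache, plugin_dir):
    """Import the plugin's bundled module and report which version it got."""
    with plugin_modules(plugin_cache, plugin_dir):
        import bundled_dep   # hypothetical module shipped inside the plugin dir
        return bundled_dep.VERSION
```

With one cache dict per plugin, two plugins that each ship their own `bundled_dep.py` see their own copy, and the host's `sys.modules` is left unchanged after each call. Objects handed from one plugin to another can still carry references to the "wrong" copy of a module, so this gives namespace separation rather than real isolation.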

On Wed, Jan 23, 2019 at 8:48 AM Stephan Reiter wrote:
> I am interested in both a short-term and a long-term solution. Actually, making subinterpreters work better is pretty sexy ... because it's hard. :-)

Petr noted that a number of people are working on getting subinterpreters to a good place. That includes me. [1][2] :) We'd welcome any help!

-eric

[1] https://www.python.org/dev/peps/pep-0554/
[2] https://github.com/ericsnowcurrently/multi-core-python
You all do make me feel very welcome in this community! Thank you very much! :-)
And thank you for all the thought and time you put into your message,
Eric. I do appreciate in particular all the alternatives you
presented; you provide a good picture of my options.
Not ruling out any of them, I'll stick with (single process + multiple
subinterpreters + plugins can't keep state in Python + all my Python
calls are performed on the main thread) for the time being. That's
quite a limited environment, which I hope I can make work in the long
run. And I think the concept of subinterpreters is nice and I'd like
to spend some time on the challenge of improving the situation.
So, I updated my changes and have the following on top of 3.6.1 at the moment:
https://github.com/stephanreiter/cpython/commit/c1afa0c8cdfab862f409f1c7ff02...
I did what Henry suggested and ran the Python test suite. On Windows,
with my changes I get as output:
357 tests OK.
2 tests failed:
test_re test_subprocess
46 tests skipped:
test_bz2 test_crypt test_curses test_dbm_gnu test_dbm_ndbm
test_devpoll test_epoll test_fcntl test_fork1 test_gdb test_grp
test_idle test_ioctl test_kqueue test_lzma test_nis test_openpty
test_ossaudiodev test_pipes test_poll test_posix test_pty test_pwd
test_readline test_resource test_smtpnet test_socketserver
test_spwd test_sqlite test_ssl test_syslog test_tcl
test_threadsignals test_timeout test_tix test_tk test_ttk_guionly
test_ttk_textonly test_turtle test_urllib2net test_urllibnet
test_wait3 test_wait4 test_winsound test_xmlrpc_net test_zipfile64
Total duration: 6 min 20 sec
Tests result: FAILURE
I dropped my changes and ran the test suite again using vanilla Python
and got the same result.
So, it seems that the change doesn't break anything that is tested,
but that probably doesn't mean a lot.
Tomorrow, I'll investigate the following situation if I find time:
If we create a fresh OS thread and make it call PyGILState_Ensure, it
won't have a PyThreadState saved under autoTLSkey. That means it will
create one using the main interpreter. I, as the developer embedding
Python into my application and using multiple interpreters, have no
control here. Maybe I know that under current conditions a certain
other interpreter should be used.
I'll try to provoke this situation and then introduce a callback from
Python into my application that will allow me to specify which
interpreter should be used, e.g. code as follows:
    // Sketch of the proposed hook; the setter below does not exist in CPython.
    PyInterpreterState *pickAnInterpreter() {
        // nullptr maps to the main interpreter
        return activePlugin ? activePlugin->interpreter : nullptr;
    }
    PyGILState_SetNewThreadInterpreterSelectionCallback(&pickAnInterpreter);
Maybe rubbish. But I think a valuable experiment that will give me a
better understanding.
Stephan
Am Mi., 23. Jan. 2019 um 18:11 Uhr schrieb Eric Snow
Hi Stephan,
On Tue, Jan 22, 2019 at 9:25 AM Stephan Reiter
wrote: I am new to the list and arriving with a concrete problem that I'd like to fix myself.
That is great! Statements like that are a good way to get folks interested in your success. :)
I am embedding Python (3.6) into my C++ application and I would like to run Python scripts isolated from each other using sub-interpreters. I am not using threads; everything is supposed to run in the application's main thread.
FYI, running multiple interpreters in the same (e.g. main) thread isn't as well thought out as running them in separate threads. There may be assumptions in the runtime that would cause crashes or inconsistency in the runtime, so be vigilant. Is there a reason not to run the subinterpreters in separate threads?
Regarding isolation, keep in mind that there are some limitations. At an intrinsic level subinterpreters are never truly isolated since they run in the same process. This matters if you have concerns about security (which you should always consider) and stability (if a subinterpreter crashes then your whole process crashes). You can find that complete isolation via subprocess & multiprocessing.
On top of intrinsic isolation, currently subinterpreters have gaps in isolation that need fixing. For instance, they share a lot of module-global state, as well as builtin types and singletons. So data can leak between subinterpreters unexpectedly.
Finally, at the Python level subinterpreters don't have a good way to pass data around. (I'm working on that. [1]) Naturally at the C level you can keep pointers to objects and share data that way. Just keep in mind that doing so relies on the GIL (in an interpreter-per-thread scenario, which you're avoiding). In a world where subinterpreters don't share the GIL [2] (and you're running one interpreter per thread) you'll end up with refcounting races, leading to crashes. Just keep that in mind if you decide to switch to one-subinterpreter-per-thread.
On Tue, Jan 22, 2019 at 8:09 PM Stephan Reiter wrote:
Nathaniel, I'd like to allow Python plugins in my application. A plugin should be allowed to bring its own modules along (i.e. plugin-specific subdir is in sys.path when the plugin is active) and hence some isolation of them will be needed, so that they can use different versions of a given module. That's my main motivation for using subinterpreters.
That's an interesting approach. Using subinterpreters would indeed give you isolation between the sets of imported modules.
As you noticed, you'll run into some problems when extension modules are involved. There aren't any great workarounds yet. Subinterpreters are tied pretty tightly to the core runtime so it's hard to attack the problem from the outside. Furthermore, subinterpreters aren't widely used yet so folks haven't been very motivated to fix the runtime. (FWIW, that is changing.)
I thought about running plugins out of process - a separate process for every plugin - and allowing them to communicate with my application via RPC. But that makes it more complex to implement the API my application will offer and will slow things down due to the need to copy data.
Yep. It might be worth it though. Note that running plugins/extensions in separate processes is a fairly common approach for a variety of solid technical reasons (e.g. security, stability). FWIW, there are some tools available (or soon to be) for sharing data more efficiently (e.g. shared memory in multiprocessing, PEP 574).
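A minimal sketch of the one-process-per-plugin option, assuming a JSON request/reply exchanged over stdin/stdout. The `WORKER` script and `call_plugin` helper are hypothetical names chosen for illustration; a real plugin API would need a richer protocol.

```python
import json
import subprocess
import sys

# Hypothetical plugin worker: reads one JSON request, writes one JSON reply.
WORKER = """
import json, sys
request = json.load(sys.stdin)
json.dump({"result": sum(request["values"])}, sys.stdout)
"""


def call_plugin(values):
    """Run the worker in its own process and return its decoded reply."""
    completed = subprocess.run(
        [sys.executable, "-c", WORKER],
        input=json.dumps({"values": values}),
        capture_output=True,
        text=True,
        check=True,   # raise if the plugin process exits with an error
    )
    return json.loads(completed.stdout)


reply = call_plugin([1, 2, 3])
```

The data is copied in both directions, which is the cost mentioned above; in exchange, a crash in the plugin process does not take the application down.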
Maybe you have another idea for me? :)
* single proc -- keep using subinterpreters
  + dlmopen or the Windows equivalent (I hesitate to suggest this hack, but it might help somewhat with extension modules)
  + help fix the problems with subinterpreters :)
* single proc -- no subinterpreters
  + import hook to put plugins in their own namespace (tricky with extension modules)
  + extend importlib to do the same
  + swap sys.modules in and out around plugin use
* multi-proc -- one process per plugin
  + subprocess
  + multiprocessing
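The "swap sys.modules in and out around plugin use" option could look roughly like the sketch below. It assumes pure-Python plugin modules only; the `Plugin` class, `activated` context manager, and the `demo_helper` module name are hypothetical. The dict is mutated in place rather than rebound, since the import machinery may hold its own reference to it.

```python
import contextlib
import importlib
import os
import sys
import tempfile


class Plugin:
    """Hypothetical record of one plugin's module directory and cached modules."""

    def __init__(self, directory):
        self.directory = directory
        self.modules = {}


@contextlib.contextmanager
def activated(plugin):
    """Make the plugin's modules visible for the duration of the block."""
    baseline = dict(sys.modules)
    saved_path = list(sys.path)
    sys.modules.update(plugin.modules)     # mutate in place, never rebind
    sys.path.insert(0, plugin.directory)
    importlib.invalidate_caches()
    try:
        yield
    finally:
        for name, module in list(sys.modules.items()):
            if baseline.get(name) is not module:
                plugin.modules[name] = module   # remember for next activation
                del sys.modules[name]
        sys.modules.update(baseline)            # restore shadowed entries
        sys.path[:] = saved_path


def make_plugin(version):
    """Create a throwaway plugin directory containing demo_helper.py."""
    directory = tempfile.mkdtemp()
    with open(os.path.join(directory, "demo_helper.py"), "w") as handle:
        handle.write(f"VERSION = {version}\n")
    return Plugin(directory)


plugin_a, plugin_b = make_plugin(1), make_plugin(2)
seen = []
for plugin in (plugin_a, plugin_b, plugin_a):
    with activated(plugin):
        import demo_helper
        seen.append(demo_helper.VERSION)
```

Note that any standard-library module first imported while a plugin is active would also be treated as plugin-private in this sketch, so a real implementation would need a rule for which modules are shared.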
On Wed, Jan 23, 2019 at 8:48 AM Stephan Reiter wrote:
Well, the plugins would be created by third parties and I'd like to enable them to bundle modules with their plugins. I am afraid of modules with the same name, but being different, or different versions of modules being used by different plugins. If plugins share an interpreter, the module with a given name that is imported first sticks around forever and for all plugins.
I am thinking about this design:
- Plugins don't maintain state in their Python world. They expose functions, my application calls them.
- Every time I call into them, they are presented with a clean global namespace. After the call, the namespace (dict) is thrown away. That releases any objects the plugin code has created.
- So, then I could also actively unload modules they loaded. But I do know that this is problematic in particular for modules that use native code.
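The "clean global namespace per call" part of this design can be sketched as follows. `call_plugin` and `PLUGIN_SOURCE` are hypothetical; the sketch re-executes the plugin source on every call, which a real host would probably avoid by caching the compiled code object.

```python
PLUGIN_SOURCE = """
calls = 0

def handle(value):
    global calls
    calls += 1
    return value * 2, calls
"""


def call_plugin(source, function_name, *args):
    """Execute plugin source in a throwaway namespace and call one function."""
    namespace = {"__name__": "plugin"}
    exec(compile(source, "<plugin>", "exec"), namespace)
    try:
        return namespace[function_name](*args)
    finally:
        namespace.clear()   # drop everything the plugin created


# The module-level counter does not survive between calls.
first = call_plugin(PLUGIN_SOURCE, "handle", 21)
second = call_plugin(PLUGIN_SOURCE, "handle", 21)
```

Modules the plugin imports still land in sys.modules and outlive the namespace, which is why the last bullet about unloading modules is a separate problem.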
I am interested in both a short-term and a long-term solution. Actually, making subinterpreters work better is pretty sexy ... because it's hard. :-)
Petr noted that a number of people are working on getting subinterpreters to a good place. That includes me. [1][2] :) We'd welcome any help!
-eric
[1] https://www.python.org/dev/peps/pep-0554/ [2] https://github.com/ericsnowcurrently/multi-core-python
If your primary concern is module clashes between plugins, maybe you can hack around that:
1) if the plugins are providing copies of any other modules, then you can simply require them to put them in their own namespace; that is, a plug-in is a single package, with however many sub modules as it may need.
2) if plugins might require third party packages that need to be isolated, then maybe you could use an import hook that re-names/isolates the modules each plugin loads, so they are kept separate.
I haven't thought through how to do any of this, but in principle, you can have the same module loaded twice if it has a different name.
Not that sub interpreters aren't cool and useful, but you can probably handle module clashes in a simpler way.
-CHB
On Thu, Jan 24, 2019 at 1:25 PM Chris Barker - NOAA Federal via Python-Dev < python-dev@python.org> wrote:
If your primary concern is module clashes between plugins, maybe you can hack around that:
1) if the plugins are providing copies of any other modules, then you can simply require them to put them in their own namespace — that is, a plug-in is a single package, with however many sub modules as it may need.
2) if plugins might require third party packages that need to be isolated, then maybe you could use an import hook that re-names/isolates the modules each plugin loads, so they are kept separate.
I haven’t thought through how to do any of this, but in principle, you can have the same module loaded twice if it has a different name.
This is dangerous for extension modules. C is a single global space unrelated to Python module names that cannot be isolated without intentionally building and linking each desired extension module statically and configured not to export its own symbols (no-export-dynamic). Non trivial.
Suggesting importing the same extension module multiple times under different Python sys.modules names is a recipe for disaster. Most extension module code is not written with that in mind. So while *some* things happen to "work", many others blow up in unexpected hard to debug ways.
Not that sub interpreters aren't cool and useful, but you can probably handle module clashes in a simpler way.
They're a cool and useful theory... but I really do not recommend their use for code importing other libraries expecting to be isolated. CPython doesn't offer multiple isolated runtimes in a process today.
-gps
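For pure-Python modules only, loading the same file name twice under different module names can be sketched with importlib. The `load_as` helper and the `plugin_a_helper` / `plugin_b_helper` names are hypothetical; per the warning above, this should not be assumed safe for extension modules.

```python
import importlib.util
import os
import sys
import tempfile


def load_as(private_name, path):
    """Load the pure-Python file at `path` under the name `private_name`."""
    spec = importlib.util.spec_from_file_location(private_name, path)
    module = importlib.util.module_from_spec(spec)
    sys.modules[private_name] = module   # register before executing the body
    spec.loader.exec_module(module)
    return module


def write_helper(version):
    """Create a throwaway helper.py and return its path."""
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "helper.py")
    with open(path, "w") as handle:
        handle.write(f"VERSION = {version}\n")
    return path


# Two different files, both called helper.py, coexist under distinct names.
helper_a = load_as("plugin_a_helper", write_helper(1))
helper_b = load_as("plugin_b_helper", write_helper(2))
```

Imports made from inside such a module still go through the normal machinery, so isolating a plugin's transitive dependencies would take an actual import hook rather than this one-off loader.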
On Thu, 24 Jan 2019 at 05:45, Stephan Reiter
If we create a fresh OS thread and make it call PyGILState_Ensure, it won't have a PyThreadState saved under autoTLSkey. That means it will create one using the main interpreter. I, as the developer embedding Python into my application and using multiple interpreters, have no control here. Maybe I know that under current conditions a certain other interpreter should be used.
I'll try to provoke this situation and then introduce a callback from Python into my application that will allow me to specify which interpreter should be used, e.g. code as follows:
PyInterpreter *pickAnInterpreter() {
    // nullptr maps to main interpreter
    return activePlugin ? activePlugin->interpreter : nullptr;
}
PyGILState_SetNewThreadInterpreterSelectionCallback(&pickAnInterpreter);
Maybe rubbish. But I think a valuable experiment that will give me a better understanding.
That actually sounds like a pretty plausible approach to me, at least for cases where the embedding application maintains some other state that lets it know which interpreter a new thread should be associated with. The best aspect of it is that it would let the embedding application decide how to handle registration of previously unknown threads with the Python runtime *without* requiring that all existing extension modules switch to a new thread registration API first.
I'll pass the concept along to Graham Dumpleton (author of the mod_wsgi module for Apache httpd) to see if an interface like this might be enough to resolve some of the major compatibility issues mod_wsgi currently encounters with subinterpreters.
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
Cool. Thanks, Nick!
I did experiments based on this idea (
https://github.com/stephanreiter/cpython/commit/3bca91c26ac81e517b4aa22302be...)
and haven't rejected it yet. :-)
Together with the other fix (
https://github.com/stephanreiter/cpython/commit/c1afa0c8cdfab862f409f1c7ff02...),
numpy at least is happy in my Python-hosting app.
I will pursue the idea of swapping sys.modules in a single interpreter now
because that wouldn't require patching Python and I might get the mileage
I need out of this approach.
Still interested in improving sub-interpreters, though. I just need to
balance short- and long-term solutions. :-)
Stephan
On Sun, Jan 27, 2019, 06:34 Stephan Reiter wrote:
Cool. Thanks, Nick!
I did experiments based on this idea (https://github.com/stephanreiter/cpython/commit/3bca91c26ac81e517b4aa22302be...) and haven't rejected it yet. :-)
Together with the other fix (https://github.com/stephanreiter/cpython/commit/c1afa0c8cdfab862f409f1c7ff02...), numpy at least is happy in my Python-hosting app.
So again, just to make sure you're aware, even if it looks like it's working right now, there are definitely many subtle ways that numpy will break when used in a subinterpreter and this configuration is not supported by the numpy devs. If you discover later that there's some strange crash, or even that you've been getting incorrect results for months without noticing, then the numpy devs will be sympathetic but will probably close your bugs without further investigation.
-n
On Mon, 28 Jan 2019 at 00:32, Stephan Reiter
Cool. Thanks, Nick!
I did experiments based on this idea (https://github.com/stephanreiter/cpython/commit/3bca91c26ac81e517b4aa22302be...) and haven't rejected it yet. :-)
After talking to Graham about this, I unfortunately realised that the reason the callback approach appears to work for you is that your application is single-threaded, so you can readily map any invocation of the callback to the desired interpreter. Multi-threaded applications won't have that luxury - they need to be able to set the callback target on a per-thread basis.
Graham actually described a plausible approach for doing that several years back: https://bugs.python.org/issue10915#msg126387
We have much better subinterpreter testing support now, so if this is an area that you're interested in, one potential place to start would be to get Antoine's patch back to a point where it applies and compiles again.
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
Reading through that post, I think I have everything covered but this here:
- The third and final scenario, and the one where the extended GIL
state functions for Ensure is still required, is where code doesn't
have the GIL as yet and wants to make a call into sub interpreter
rather than the main interpreter, where it already has a pointer to
the sub interpreter and nothing more. In this case the new
PyGILState_EnsureEx() function is used, with the sub interpreter being
passed as argument.
If I understand it correctly, it means the following in practice:
Whenever I or a third-party library start a new thread, we need to
query what interpreter we are running at the moment (in the thread
that is starting the new thread) and pass that information on to the
new thread so that it can initialize the GIL for itself.
Pseudo code ahead:
void do_in_thread(func_t *what) {
    PyThreadState* state = PyThreadState_Get(); // or new PyInterpreterState_Current();
    PyInterpreterState *interpreter = state->interp;
    std::thread t([what, interpreter] {
        auto s = PyGILState_EnsureEx(interpreter);
        what();
        // could also release before what() because TLS was updated and
        // next PyGILState_Ensure() will work
        PyGILState_Release(s);
    });
}
Did I get that right?
Stephan
On Mon, 28 Jan 2019 at 19:36, Stephan Reiter
Did I get that right?
Yeah, I think that's the essence of it, although the other case that can come up is when the parent thread just created a new subinterpreter (that only changes how it acquires the pointer though - the challenge of getting a child thread to make proper use of that pointer remains the same).
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
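The per-thread hand-off agreed on here can be illustrated with a toy model in pure Python. Nothing below is CPython API: `ensure_ex`, `current_interpreter`, and the string "interpreters" are hypothetical stand-ins that only model which interpreter each thread ends up bound to.

```python
import threading

_tls = threading.local()   # per-thread binding, standing in for the TLS slot


def ensure_ex(interpreter):
    """Bind the calling thread to `interpreter` unless it is already bound."""
    if getattr(_tls, "interpreter", None) is None:
        _tls.interpreter = interpreter
    return _tls.interpreter


def current_interpreter():
    """Return the calling thread's interpreter, defaulting to main."""
    return getattr(_tls, "interpreter", "main")


def do_in_thread(what):
    """Run `what` in a child thread bound to the caller's interpreter."""
    interpreter = current_interpreter()   # queried in the parent thread
    result = []

    def body():
        ensure_ex(interpreter)            # child binds itself explicitly
        result.append(what())

    child = threading.Thread(target=body)
    child.start()
    child.join()
    return result[0]


def parent(interpreter, out):
    """A parent thread bound to one interpreter that spawns a child."""
    ensure_ex(interpreter)
    out[interpreter] = do_in_thread(current_interpreter)


seen = {}
parents = [threading.Thread(target=parent, args=(name, seen))
           for name in ("plugin-a", "plugin-b")]
for thread in parents:
    thread.start()
for thread in parents:
    thread.join()
```

Because the interpreter is captured in the parent and passed along, two parents bound to different interpreters can spawn children concurrently without a process-wide callback having to guess.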
participants (8)
- Antoine Pitrou
- Chris Barker - NOAA Federal
- Eric Snow
- Gregory P. Smith
- Nathaniel Smith
- Nick Coghlan
- Petr Viktorin
- Stephan Reiter