On 11/12/18 19:39, Stefan Behnel wrote:
Ricardo Dias wrote on 10.12.18 at 14:42:
The recent Cython 0.29 release includes a commit [1] that hinders the use of Python subinterpreters.
I discovered this the hard way when suddenly a component I was working on started to crash. The component in question is the ceph-mgr daemon from the Ceph project [2].
Python subinterpreters are the basic building block for the plugin/module architecture of ceph-mgr. Each "manager module" runs in its own Python subinterpreter. Furthermore, all Python bindings for the Ceph client libraries, such as librados, librbd, libcephfs, and librgw, are implemented as Cython modules, and in the particular case of librados, all ceph-mgr plugin modules import the rados Cython module upon initialization.
In practice, with Cython 0.29 only the first subinterpreter can load a Cython module, because every subsequent attempt to load it is refused.
After discovering this issue, we "temporarily" worked around it by restricting the version of Cython we depend on [3]. But we don't want to keep this restriction indefinitely and would prefer a fix on the Cython side.
Do you think it's feasible to implement a flag to disable the safeguard introduced in [1]? That way we could re-enable subinterpreters at our own risk.
[1] https://github.com/cython/cython/commit/7e27c7cd51a2f048cd6d3c246740cd977f8d...
[2] https://github.com/ceph/ceph
[3] https://github.com/ceph/ceph/pull/25328
My guess is that your modules just silently leaked object references and memory with the previous Cython versions. That is why we now inserted a guard that detects cases where the module init function is executed multiple times, which would overwrite the state of the previous run. The shared library of an extension module is only loaded once, so any global C state is shared for the entire process, regardless of how often CPython calls the module init function.
I assume that the problem with subinterpreters occurs when a Cython module declares static/global variables, which might cause undesirable side effects when the module is loaded in several subinterpreters. I believe the Cython modules that we develop in Ceph do not declare any global state, and therefore they have been working well when loaded by several subinterpreters.
I am surprised that your setup didn't crash in any way. Could you explain a bit more how you are using this feature? Are the different subinterpreters running in parallel or sequentially? The ceph repo looks huge. Any pointers where I should start looking?
The subinterpreters run in parallel. Basically, we have a single process, the ceph-mgr daemon, which creates one subinterpreter for each mgr plugin (a plugin is basically a pure Python module) that it finds in a specific location. All these plugins import the "rados" Cython module to be able to talk to the Ceph cluster. The C++ code that manages the subinterpreters can be found at: https://github.com/ceph/ceph/tree/master/src/mgr
More specifically, see the files PyModule.* and PyModuleRegistry.*: https://github.com/ceph/ceph/blob/master/src/mgr/PyModule.cc#L324
I actually wonder if we could at least support sequential usage through the module cleanup mechanism. Once a module is cleaned up and all global objects are freed, calling the module init function again should be ok.
Apart from that, here is the feature ticket for module-specific global state:
https://github.com/cython/cython/issues/2343
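The sequential idea mentioned above could look roughly like the following toy model. This is a sketch under assumptions, not Cython's cleanup mechanism: the class `ToyExtensionModule` and its `init` and `cleanup` methods are hypothetical names that only model one shared library's process-wide state.

```python
class ToyExtensionModule:
    """Hypothetical model of the process-wide state of one shared library."""

    def __init__(self):
        self.initialized = False
        self.global_objects = {}

    def init(self, interpreter_name):
        # Refuse a second init while the previous state is still alive.
        if self.initialized:
            raise ImportError("module is still initialized (illustrative message)")
        self.global_objects = {"owner": interpreter_name}
        self.initialized = True

    def cleanup(self):
        # Free all global objects so that a later init starts from scratch.
        self.global_objects.clear()
        self.initialized = False


lib = ToyExtensionModule()

# Sequential use: init, cleanup, then init again is accepted.
lib.init("interp_1")
lib.cleanup()
lib.init("interp_2")
owner_after_reinit = lib.global_objects["owner"]

# Parallel use: a second init without a cleanup in between is still rejected.
try:
    lib.init("interp_3")
    parallel_rejected = False
except ImportError:
    parallel_rejected = True
```

This only covers subinterpreters that run one after another. Parallel subinterpreters, as in the ceph-mgr setup, would still need module-specific state as tracked in the ticket.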
Stefan
--
Ricardo Dias
Senior Software Engineer - Storage Team
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)