Safe to change a thread's interpreter?
Recently I've been researching implementation strategies for adding Java classloader-like capabilities to Python. I was pleasantly surprised to find out that CPython already supports multiple interpreters via the C API, where each "interpreter" includes fresh versions of 'sys', '__builtin__', etc. The C API doc for Py_NewInterpreter(), however, says:

"""It is possible to insert objects created in one sub-interpreter into a namespace of another sub-interpreter; this should be done with great care to avoid sharing user-defined functions, methods, instances or classes between sub-interpreters, since import operations executed by such objects may affect the wrong (sub-)interpreter's dictionary of loaded modules. (XXX This is a hard-to-fix bug that will be addressed in a future release.)"""

It seems to me that the bug described could be fixed (or at least worked around) by having __import__ temporarily change the 'interp' field of the current thread state to point to the interpreter that the __import__ function lives in. Then, at the end of the __import__, it would reset the 'interp' field back to its original value. (Of course, it would also have to fix up the linked lists of the interpreters' thread states during each swap, but that shouldn't be too difficult.)

My question is: does this make sense, or am I completely out in left field here? The only thing I can think of that this would affect is the 'threading' module, in that trying to get the current thread from there (during such an import) might see a foreign interpreter's thread as its own. But I'm hard-pressed to think of any damage that could possibly cause. Indeed, it seems to me that Python itself doesn't really care how many interpreters or thread states there are running around, and that it only has the linked lists to support "advanced debuggers".

Even if it's undesirable to fix the problem this way in the Python core, would it be acceptable to do so in an extension module?
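The swap-and-restore idea described above can be modelled in a few lines of pure Python. This is only a toy sketch: `Interp`, `ThreadState`, `swapped_interp` and `import_for` are hypothetical names invented for illustration, and plain lists stand in for the linked lists of thread states. It does not touch CPython's real structures.

```python
from contextlib import contextmanager


class Interp:
    """Toy stand-in for an interpreter (hypothetical, not a CPython type)."""

    def __init__(self, name):
        self.name = name
        self.modules = {}   # plays the role of this interpreter's sys.modules
        self.tstates = []   # plays the role of its list of thread states


class ThreadState:
    """Toy stand-in for a thread state that points at one interpreter."""

    def __init__(self, interp):
        self.interp = interp
        interp.tstates.append(self)


@contextmanager
def swapped_interp(tstate, target):
    """Temporarily point tstate at target, restoring the original on exit."""
    original = tstate.interp
    # Fix up both interpreters' thread-state lists as part of the swap.
    original.tstates.remove(tstate)
    target.tstates.append(tstate)
    tstate.interp = target
    try:
        yield tstate
    finally:
        # Runs even if the body raises, so an exception reaches the caller
        # with the thread state already back where it started.
        target.tstates.remove(tstate)
        original.tstates.append(tstate)
        tstate.interp = original


def import_for(tstate, owner, name):
    """Record name in owner's module table, not the calling interpreter's."""
    with swapped_interp(tstate, owner):
        return tstate.interp.modules.setdefault(name, object())


parent = Interp("parent")
child = Interp("child")
ts = ThreadState(parent)
import_for(ts, child, "somemodule")
```

After the call, "somemodule" is recorded only in the child's module table and the thread state points at the parent again. Whether the real thread state tolerates this kind of swap is exactly the open question of this message.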
What I have in mind is to create an extension module that wraps PyInterpreterState/PyThreadState objects up in a subclassable extension type, designed to ensure the integrity of Python as a whole, while still allowing various import-related methods to be overridden in order to implement Java-style classloader hierarchies. So, you might do something like:

    from interpreter import Interpreter

    # Run 'somescript' in its own interpreter.
    it = Interpreter()
    exit_code = it.run_main("somescript.py")

    # Release resources without waiting for GC
    it.close()

My thought here is that operations such as running code in a given Interpreter would also work by swapping the thread state's 'interp' field. Thus, exceptions in the child interpreter would be seamlessly carried through to the parent interpreter.

In order to implement the full Java classloader model, it would also be necessary to be able to force imports *not* to use the Interpreter that the code doing the import came from (i.e. the equivalent of using 'java.lang.Thread.setContextClassLoader()'). This can also probably be implemented via a thread-local variable in the 'interpreter' module.

So... must a thread state always reference the same interpreter object? If not, then I think I see a way to safely implement access to multiple interpreters from within Python itself.
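The thread-local override mentioned above (the analogue of setContextClassLoader()) might look roughly like the following sketch. The names `set_context_interpreter` and `interpreter_for_import` are hypothetical, and strings stand in for interpreter objects. It only illustrates the lookup rule, not any real 'interpreter' module.

```python
import threading

_context = threading.local()   # hypothetical per-thread override slot


def set_context_interpreter(interp):
    """Set (or clear, with None) the interpreter this thread's imports use."""
    _context.interp = interp


def interpreter_for_import(defining_interp):
    """Return the per-thread override if set, else the importing code's own."""
    override = getattr(_context, "interp", None)
    return defining_interp if override is None else override


results = {}


def worker():
    # A different thread never sees the main thread's override.
    results["worker"] = interpreter_for_import("defining")


set_context_interpreter("override")
results["main"] = interpreter_for_import("defining")

t = threading.Thread(target=worker)
t.start()
t.join()

set_context_interpreter(None)
results["cleared"] = interpreter_for_import("defining")
```

Because the slot is a `threading.local`, setting the override on one thread leaves every other thread on the default rule, which is the behaviour a context classloader needs.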
Phillip J. Eby wrote:
Recently I've been researching implementation strategies for adding Java classloader-like capabilities to Python. I was pleasantly surprised to find out that CPython already supports multiple interpreters via the C API, where each "interpreter" includes fresh versions of 'sys', '__builtin__', etc.
You should be aware that many of us consider the feature of multiple interpreters broken. For example, global variables in extension modules are shared across interpreters, and there is nothing that can be done about this, except for changing the entire C API.

Regards,
Martin
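The sharing problem described above can be illustrated with a toy model, under loudly stated assumptions: `ToyInterpreter` and `bump` are hypothetical, a module-level Python list stands in for a C `static` variable, and a plain dict stands in for a module object. Real extension modules are not implemented this way.

```python
# Stands in for a `static` variable in an extension's C source:
# there is one copy per process, not one per interpreter.
_c_static_counter = [0]


def bump():
    """Stand-in for an extension function that mutates the C-level global."""
    _c_static_counter[0] += 1
    return _c_static_counter[0]


class ToyInterpreter:
    """Hypothetical interpreter with its own table of loaded modules."""

    def __init__(self):
        self.modules = {}   # per-interpreter, like sys.modules

    def import_ext(self):
        # Each interpreter gets its own module object (a dict here), but the
        # function inside it still refers to the process-wide state.
        if "ext" not in self.modules:
            self.modules["ext"] = {"bump": bump}
        return self.modules["ext"]


a = ToyInterpreter()
b = ToyInterpreter()
a_mod = a.import_ext()
b_mod = b.import_ext()

first = a_mod["bump"]()    # called "inside" interpreter a
second = b_mod["bump"]()   # interpreter b sees the effect of a's call
```

The two module objects are distinct, yet the second call observes the first one's increment, which is the kind of leak between interpreters that separate module tables alone cannot prevent.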
At 06:19 AM 8/2/04 +0200, Martin v. Löwis wrote:
Phillip J. Eby wrote:
Recently I've been researching implementation strategies for adding Java classloader-like capabilities to Python. I was pleasantly surprised to find out that CPython already supports multiple interpreters via the C API, where each "interpreter" includes fresh versions of 'sys', '__builtin__', etc.
You should be aware that many of us consider the feature of multiple interpreters broken. For example, global variables in extension modules are shared across interpreters, and there is nothing that can be done about this, except for changing the entire C API.
Yes, I saw that as a documented limitation. Are there undocumented limitations as well? Is the feature headed for deprecation? I guess I'm not understanding your implication(s), if any.
Phillip J. Eby wrote:
Yes, I saw that as a documented limitation. Are there undocumented limitations as well?
Might be. However, there is one more documented limitation: the PEP 311 extensions only work for a single interpreter state, as the PEP explains.
Is the feature headed for deprecation? I guess I'm not understanding your implication(s), if any.
That is not surprising, as the implication is not at all obvious :-) Here it is: Because the feature is essentially ill-designed, I'm not going to comment on the actual question (or on any other actual question about that feature), except for pointing out that the feature is fundamentally flawed. This is just my own policy, though.

Regards,
Martin
At 09:34 AM 8/2/04 +0200, Martin v. Löwis wrote:
Phillip J. Eby wrote:
Yes, I saw that as a documented limitation. Are there undocumented limitations as well?
Might be. However, there is one more documented limitation: the PEP 311 extensions only work for a single interpreter state, as the PEP explains.
Maybe I'm misinterpreting it, but the "Design and Implementation" section sounds like the only issue would be that any automatically-allocated thread state would point to the primary interpreter. For my intended use, that's not actually a problem, since import operations would have to switch the interpreter to the "correct" one, anyway.
Participants (2): "Martin v. Löwis", Phillip J. Eby