
On Mon, Jun 22, 2015 at 5:59 PM, Nathaniel Smith <njs@pobox.com> wrote:
On Mon, Jun 22, 2015 at 10:37 AM, Gregory P. Smith <greg@krypto.org> wrote:
...
One possibility would be for subinterpreters to copy modules from the main interpreter -- I guess your average module is mostly dicts, strings, type objects, and functions; strings and functions are already immutable and could be shared without copying, and I guess copying the dicts and type objects into the subinterpreter is much cheaper than hitting the disk etc. to do a real import. (Though certainly not free.)
Yeah, I think there are a number of mechanisms we can explore to improve the efficiency of subinterpreter startup (and sharing).
This would have interesting semantic implications -- it would give similar effects to fork(), with subinterpreters starting from a snapshot of the main interpreter's global state.
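A rough way to picture that snapshot behaviour in pure Python is sketched below. This is only an illustration, not how CPython subinterpreters work: snapshot_module is a hypothetical helper, and the toy "settings" module and its attributes are made up for the example. Mutable containers are copied and everything else (strings, functions) is shared by reference.

```python
import copy
import types


def snapshot_module(mod):
    """Return a new module whose namespace is a snapshot of mod's.

    Hypothetical helper, not a CPython API. Dicts, lists and sets are
    deep-copied; every other value (strings, functions, ...) is shared.
    """
    clone = types.ModuleType(mod.__name__, mod.__doc__)
    for name, value in vars(mod).items():
        # Skip module bookkeeping attributes such as __name__ and __spec__.
        if name.startswith("__") and name.endswith("__"):
            continue
        if isinstance(value, (dict, list, set)):
            clone.__dict__[name] = copy.deepcopy(value)
        else:
            clone.__dict__[name] = value
    return clone


def helper(x):
    return x + 1


# Toy stand-in for a module already imported in the main interpreter.
main_mod = types.ModuleType("settings")
main_mod.NAME = "demo"
main_mod.CACHE = {"hits": 0}
main_mod.helper = helper

# The "subinterpreter" starts from a snapshot and then diverges.
sub_mod = snapshot_module(main_mod)
sub_mod.CACHE["hits"] += 1
```

After this runs, main_mod.CACHE is unchanged while sub_mod.CACHE has diverged, much like a child process after fork(). One caveat the sketch ignores: a shared function still points at its original globals through __globals__, so a real implementation would have to deal with that as well.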
I'm not entirely sold on this overall proposal, but I think one result of it could be to make our subinterpreter support better, which would be a good thing.
We have had to turn people away from subinterpreters in the past when they wanted to use them in a multithreaded C++ server that occasionally runs some Python code in embedded interpreters as part of serving some requests. Doing that would suddenly single-thread their application (GIIIIIIL!) for all requests currently executing Python code, despite the multiple subinterpreters.
I've also talked to HPC users who discovered this problem the hard way (e.g. http://www-atlas.lbl.gov/, folks working on the Large Hadron Collider) -- they've been using Python as an extension language in some large physics codes but are now porting those bits to C++ because of the GIL issues. (In this context startup overhead should be easily amortized, but switching to an RPC model is not going to happen.)
Would this proposal make a difference for them?

-eric