
On Sat, Jun 20, 2015 at 4:16 PM, Eric Snow <ericsnowcurrently@gmail.com> wrote:
On Jun 20, 2015 4:55 PM, "Devin Jeanpierre" <jeanpierreda@gmail.com> wrote:
It's worthwhile to consider fork as an alternative. IMO we'd get a lot out of making forking safer, easier, and more efficient. (e.g. respectively: adding an atfork registration mechanism; separating out the bits of multiprocessing that use pickle from those that don't; moving the refcount to a separate page, or allowing it to be frozen prior to a fork.)
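For illustration, here is a minimal sketch of what an atfork registration mechanism can look like. It assumes Python 3.7 or later on a POSIX system, where `os.register_at_fork` is available; the cached-PID scenario and the helper names `_refresh_pid` and `child_sees_fresh_pid` are hypothetical, chosen only to show a hook running in the child.

```python
import os

# Module-level state that becomes stale in a forked child.
_cached_pid = os.getpid()


def _refresh_pid():
    """Hook run in the child right after fork to repair stale state."""
    global _cached_pid
    _cached_pid = os.getpid()


# Register the hook once; it stays registered for the life of the process.
os.register_at_fork(after_in_child=_refresh_pid)


def child_sees_fresh_pid():
    """Fork, and report whether the child's cached PID was refreshed."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: the after_in_child hook has already run at this point.
        os.close(read_fd)
        os.write(write_fd, b"1" if _cached_pid == os.getpid() else b"0")
        os._exit(0)
    # Parent: read the child's one-byte verdict, then reap the child.
    os.close(write_fd)
    verdict = os.read(read_fd, 1)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return verdict == b"1"
```

The same hook points (`before`, `after_in_parent`, `after_in_child`) are where a library would reset locks or reseed random state, which is the "safer" part of the suggestion.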
So leverage a common base of code with the multiprocessing module?
What is this question in response to? I don't understand.
I would expect subinterpreters to use less memory. Furthermore creating them would be significantly faster. Passing objects between them would be much more efficient. And, yes, cross-platform.
Maybe I don't understand how subinterpreters work. AIUI, the whole point of independent subinterpreters is that they share no state. So if I have a web server, each independent serving thread has to do all of the initialization (import HTTP libraries, etc.), right? Compare with forking, where the initialization is all done and then you fork, and you are immediately ready to serve, using the data structures shared with all the other workers, which are only copied when they are written to. So forking starts up faster and uses less memory (due to shared memory). Re passing objects, see below. I do agree it's cross-platform, but right now that's the only thing I agree with.
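To make the forking side of that comparison concrete, here is a minimal sketch, assuming a POSIX system with `os.fork`. The names `load_shared_state` and `serve_in_forked_workers` and the toy routing table are hypothetical stand-ins for a real server's initialization.

```python
import os


def load_shared_state():
    """Stand-in for expensive one-time setup (imports, config, caches)."""
    return {"routes": {"/": "index", "/about": "about"}}


def serve_in_forked_workers(n_workers=2):
    """Initialize once in the parent, then fork workers that reuse the state."""
    state = load_shared_state()  # done once, before any fork
    pids = []
    for _ in range(n_workers):
        pid = os.fork()
        if pid == 0:
            # Child: the state is already in memory, so no re-initialization.
            ok = state["routes"]["/"] == "index"
            os._exit(0 if ok else 1)
        pids.append(pid)
    # Parent: reap each worker and collect its exit code.
    exit_codes = []
    for pid in pids:
        _, status = os.waitpid(pid, 0)
        exit_codes.append(os.WEXITSTATUS(status))
    return exit_codes
```

Each child starts with the parent's pages mapped copy-on-write. In CPython, merely reading an object updates its reference count and so writes to its page, which is why the earlier suggestion about moving or freezing the refcount matters for the memory savings.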
Note: I don't count the IPC cost of forking, because at least on linux, any way to efficiently share objects between independent interpreters in separate threads can also be ported to independent interpreters in forked subprocesses,
How so? Subinterpreters are in the same process. For this proposal each would be on its own thread. Sharing objects between them through channels would be more efficient than IPC. Perhaps I've missed something?
You might be missing that memory can be shared between processes, not just threads, but I don't know. The reason passing objects between processes is so slow is currently *nearly entirely* the cost of serialization. That is, it's the fact that you are passing an object to an entirely separate interpreter, and need to serialize the whole object graph and so on. If you can make that fast without serialization, for shared-memory threads, then all the serialization becomes unnecessary, and you can either write to a pipe (fast, if it's a non-container), or use shared memory from the beginning (instantaneous). This is possible on any POSIX OS. Linux lets you go even further.

-- Devin
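As a rough sketch of memory shared between processes rather than threads, the following assumes a POSIX system, where `mmap.mmap(-1, length)` creates an anonymous mapping that is shared across `os.fork` by default. The function name `share_value_via_mmap` is hypothetical. It only moves a raw 8-byte integer; sharing live Python objects this way is the hard part under discussion and is not shown.

```python
import mmap
import os
import struct


def share_value_via_mmap(value=12345):
    """Let a forked child write an integer that the parent then reads."""
    # fileno -1 requests an anonymous mapping; on Unix the default
    # flags are MAP_SHARED, so parent and child see the same bytes.
    buf = mmap.mmap(-1, 8)
    buf[:8] = struct.pack("q", 0)
    pid = os.fork()
    if pid == 0:
        # Child: write directly into the shared region, no pickling.
        buf[:8] = struct.pack("q", value)
        os._exit(0)
    # Parent: wait for the child, then read what it wrote.
    os.waitpid(pid, 0)
    (seen,) = struct.unpack("q", buf[:8])
    buf.close()
    return seen
```

No serialization of an object graph happens here, and no data crosses a pipe or socket: the child's write is visible to the parent as soon as it is made.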