[Python-ideas] solving multi-core Python

Wed Jun 24 07:48:00 CEST 2015

On Sun, Jun 21, 2015 at 3:08 PM, Andrew Barnert <abarnert at yahoo.com> wrote:
> First, a minor question: instead of banning fork entirely within subinterpreters, why not just document that it is illegal to do anything between fork and exec in a subinterpreter, except for a very small (but possibly extensible) subset of Python? For example, after fork, you can no longer access any channels, and you also can't use signals, threads, fork again, imports, assignments to builtins, raising exceptions, or a whole host of other things (but of course if you exec an entirely new Python interpreter, it can do any of those things).

Sure.  I expect the quickest approach, though, will be to initially
have blanket restrictions and then ease them once the core
functionality is complete.
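
To make the "exec an entirely new Python interpreter" case from the quoted
text concrete, here is a minimal sketch using only the standard library.
The helper name run_in_fresh_interpreter is hypothetical and is not part of
the proposal. It only shows that code run in a separate, freshly started
interpreter process is not bound by whatever restrictions the parent places
on itself; how a subinterpreter would actually be allowed to launch it is
still an open question.

```python
import subprocess
import sys


def run_in_fresh_interpreter(source: str) -> str:
    """Run `source` in a brand-new Python process and return its stdout.

    Hypothetical helper for illustration only. The child is a separate
    process running its own interpreter, so it may import modules, start
    threads, install signal handlers, and so on.
    """
    result = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,  # collect stdout/stderr instead of inheriting them
        text=True,            # decode the output to str
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return result.stdout


# The child imports freely and inspects its own thread state.
child_source = "import threading; print(threading.active_count())"
child_output = run_in_fresh_interpreter(child_source)
```

The parent never runs arbitrary Python between creating the child and the
child starting its own interpreter, which is the property the restriction
is meant to guarantee.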

> C extension modules could just have a flag that marks whether the whole module is fork-safe or not (defaulting to not).

That may make sense independently from my proposal.

> So, this allows a subinterpreter to use subprocess (or even multiprocessing, as long as you use the forkserver or spawn mechanism), and it gives code that intentionally wants to do tricky/dangerous things a way to do them, but it avoids all of the problems with accidentally breaking a subinterpreter by forking it and then doing bad things.
>
> Second, a major question: In this proposal, are builtins and the modules map shared, or copied?
>
> If they're copied, it seems like it would be hard to do that even as efficiently as multiprocessing, much less more efficiently. Of course you could fake this with CoW, but I'm not sure how you'd do that, short of CoWing the entire heap (by using clone instead of pthreads on Linux, or by doing a bunch of explicit mmap and related calls on other POSIX systems), at which point you're pretty close to just implementing fork or vfork yourself to avoid calling fork or vfork, and unlikely to get it as efficient or as robust as what's already there.
>
> If they're shared, on the other hand, then it seems like it becomes very difficult to implement subinterpreter-safe code, because it's no longer safe to import a module, set a flag, call a registration function, etc.
>
>

I expect that ultimately the builtins will be shared in some fashion.
To some extent they already are.  sys.modules (and the rest of the
import machinery) will mostly not be shared, though I expect that
likewise we will have some form of sharing where we can get away with
it.
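
For illustration, here is a small sketch of why a fully shared sys.modules
would be a problem. It runs in one ordinary interpreter and does not use
subinterpreters at all; the module name demo_registry and the two importer
functions are hypothetical stand-ins for two pieces of code that would
otherwise live in separate subinterpreters. Because both importers resolve
the name through the same sys.modules, a flag set by one is immediately
visible to the other.

```python
import importlib
import sys
import types

# Hypothetical module name, used only for this illustration.
MODULE_NAME = "demo_registry"

# Build an in-memory module and register it in the import cache.
module = types.ModuleType(MODULE_NAME)
module.initialized = False
sys.modules[MODULE_NAME] = module


def importer_a() -> None:
    """Import the module and flip a module-level flag."""
    mod = importlib.import_module(MODULE_NAME)
    mod.initialized = True


def importer_b() -> bool:
    """Import the same name and report the flag it observes."""
    mod = importlib.import_module(MODULE_NAME)
    return mod.initialized


importer_a()
flag_seen_by_b = importer_b()  # True: both importers got the same object
```

With a per-interpreter sys.modules, each importer would get its own module
object and importer_b would see the flag in its initial state.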

-eric