Sorry for the late reply, but this itch finally got to me...
> Please do not claim that fork() semantics and copy-on-write are good things to build off of...
They work just fine for large classes of problems that require hundreds or thousands of cores.
> They are not. fork() was designed in a world *before threads* existed.
This is wrong. While the name "thread" may not have existed when fork() was created, the *concept* of concurrent execution in a shared address space predates the creation of Unix by a good decade. Most notably, Multics - what the creators of Unix were working on before they did Unix - at least discussed the idea, though it may never have been implemented (a common fate of Multics features).

Also notable is that Unix introduced the then ground-breaking idea of having the command processor create a new process to run user programs. Before Unix, user commands were run in the process (and hence address space) of the command processor. Running things in what is now called "the background" (which this architecture made a major PITA) gave you concurrent execution in a shared address space - what we today call threads.

Those systems did this because creating a process was *expensive*. That's also why the Multics folks looked at threads. The Unix fork/exec pair was cheap and flexible, allowing the creation of a command processor that supported easy backgrounding, pipes, and IO redirection. Fork has since gotten more expensive, in spite of the ongoing struggles to keep it cheap.
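To make the flexibility point concrete, here is a minimal sketch of how a command processor can wire up a two-stage pipeline with nothing but fork, exec, and dup2, using Python's os module on a POSIX system. The helper name run_piped is made up for illustration, and a real shell also deals with signals, exit statuses, and job control, all of which this ignores.

```python
import os
import sys


def run_piped(producer, consumer):
    """Run `producer | consumer` via fork/exec; return the consumer's stdout as bytes.

    Both arguments are argv-style lists. Hypothetical helper, POSIX only.
    """
    link_r, link_w = os.pipe()  # producer stdout -> consumer stdin
    out_r, out_w = os.pipe()    # consumer stdout -> parent
    all_fds = (link_r, link_w, out_r, out_w)

    pid1 = os.fork()
    if pid1 == 0:
        # First child: redirect stdout into the pipe, then become the producer.
        try:
            os.dup2(link_w, 1)
            for fd in all_fds:
                os.close(fd)
            os.execvp(producer[0], producer)
        finally:
            os._exit(127)  # only reached if the exec failed

    pid2 = os.fork()
    if pid2 == 0:
        # Second child: read from the pipe, write to the capture pipe.
        try:
            os.dup2(link_r, 0)
            os.dup2(out_w, 1)
            for fd in all_fds:
                os.close(fd)
            os.execvp(consumer[0], consumer)
        finally:
            os._exit(127)

    # Parent: close the ends it does not use so EOF can propagate.
    for fd in (link_r, link_w, out_w):
        os.close(fd)

    chunks = []
    while True:
        data = os.read(out_r, 4096)
        if not data:
            break
        chunks.append(data)
    os.close(out_r)

    os.waitpid(pid1, 0)
    os.waitpid(pid2, 0)
    return b"".join(chunks)


if __name__ == "__main__":
    upper = [sys.executable, "-c",
             "import sys; sys.stdout.write(sys.stdin.read().upper())"]
    print(run_piped([sys.executable, "-c", "print('hello')"], upper))
```

The redirection happens in the child between fork and exec, so the programs being run need no knowledge of it. That is the property that made backgrounding, pipes, and IO redirection easy to build in the command processor.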
> It simply can not be used reliably in a process that uses threads and tons of real world practical C and C++ software that Python programs need to interact with, be embedded in or use via extension modules these days uses threads quite effectively.
Personally, I find that the fact that threads can't be used reliably in a process that forks makes threads a bad thing to build off of. After all, there's tons of real-world practical software in many languages that Python needs to interact with that uses fork effectively.
> The multiprocessing module on posix would be better off if it offered a windows CreateProcess() work-a-like mode that spawns a *new* python interpreter process rather than depending on fork().
While it's a throwback to the 60s, it would make using threads and processes more convenient, but I don't need it. Why don't you submit a patch?

<mike