On Fri, Jun 12, 2020 at 7:19 AM Mark Shannon <mark@hotpy.org> wrote:
Hi Edwin,

Thanks for providing some concrete numbers.
Is it expected that creating 100 processes takes 6.3ms per process, but
that creating 1000 processes takes 40ms per process? That's over 6 times
as long in the latter case.

Cheers,
Mark.

On 12/06/2020 11:29 am, Edwin Zimmerman wrote:
> On 6/12/2020 6:18 AM, Edwin Zimmerman wrote:
>> On 6/12/2020 5:08 AM, Paul Moore wrote:
>>> On Fri, 12 Jun 2020 at 09:47, Mark Shannon <mark@hotpy.org> wrote:
>>>> Starting a new process is cheap. On my machine, starting a new Python
>>>> process takes under 1ms and uses a few Mbytes.
>>> Is that on Windows or Unix? Traditionally, process creation has been
>>> costly on Windows, which is why threads, and in-process solutions in
>>> general, tend to be more common on that platform. I haven't done
>>> experiments recently, but I do tend to avoid multiprocess-type
>>> solutions on Windows "just in case". I know that evaluating a new
>>> feature based on unsubstantiated assumptions informed by "it used to
>>> be like this" is ill-advised, but so is assuming that everything will
>>> be OK based on experience on a single platform :-)
>> Here's a test on Windows 10, 4 logical CPUs, 8 GB of RAM:
>>
>> >>> timeit.timeit("""multiprocessing.Process(target=exit).start()""",number=100, globals=globals())
>> 0.6297528999999997
>> >>> timeit.timeit("""multiprocessing.Process(target=exit).start()""",number=1000, globals=globals())
>> 40.281721199999964
>>
>> Or this way:
>> >>> timeit.timeit("""os.system('python.exe -c "exit()"')""",number=100, globals=globals())
>> 17.461259299999995
>>
>> --Edwin
> For comparison, on a single-core Linux cloud server with 512 MB of RAM:
>
> timeit.timeit("""multiprocessing.Process(target=exit).start()""",number=100, globals=globals())
> 0.354354709998006
>
> timeit.timeit("""multiprocessing.Process(target=exit).start()""",number=1000, globals=globals())
> 3.847851719998289
>
> So yeah, process creation is still rather costly on Windows.

I was wondering that too. Some tests show that process creation/destruction starts to seriously bog down after a few hundred in a row. I guess it's hitting some resource limit that it has to clean up, though creating hundreds of processes at once sounds like an antipattern that doesn't really deserve much consideration. It would be rare for fork to be more than a negligible part of any workload. (With A/V on, though, it's _much_ slower out of the gate: I'm seeing over 100ms per process with Kaspersky running.)
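
For anyone who wants to check whether the per-process cost really grows with the count, here is a minimal sketch (not the code behind any of the numbers quoted above). It assumes that launching `sys.executable -c pass` through `subprocess.run` is an acceptable stand-in for the `os.system` test. It also waits for each child before starting the next one, unlike the `Process(...).start()` timings, which never join. The names `time_batches`, `total` and `batch` are made up for illustration.

```python
import subprocess
import sys
import time


def time_batches(total=40, batch=10):
    """Launch `total` short-lived interpreters one after another.

    Returns a list with the mean creation-plus-exit time, in
    milliseconds per process, for each consecutive batch.
    """
    results = []
    for _ in range(total // batch):
        start = time.perf_counter()
        for _ in range(batch):
            # subprocess.run() blocks until the child has exited, so
            # finished processes never pile up between iterations.
            subprocess.run([sys.executable, "-c", "pass"], check=True)
        elapsed = time.perf_counter() - start
        results.append(elapsed / batch * 1000.0)
    return results


if __name__ == "__main__":
    for index, ms in enumerate(time_batches()):
        print(f"batch {index}: {ms:.2f} ms per process")
```

Raising `total` to 1000 or so should show whether later batches get slower than earlier ones on a given machine. If they do not, the slowdown in the 1000-process run is more likely down to un-joined children accumulating than to process creation itself.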

Em