Would a work-stealing approach work better for you here? Then the only signalling overhead would be when a core runs out of work.
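
For concreteness, here is a minimal sketch of what I mean by work stealing. It is only an illustration under assumptions: the name `run_work_stealing` and its parameters are made up, and it uses threads to stay short, so it shows the scheduling pattern rather than any real speedup for CPU-bound rows (that would need processes). Each worker drains its own deque and only touches another worker's deque once it runs dry.

```python
import threading
from collections import deque


def run_work_stealing(tasks, n_workers, handle):
    """Run handle(task) for every task; idle workers steal from busy ones.

    Hypothetical helper for illustration, not an existing library API.
    """
    # Deal the tasks out up front: one private deque per worker.
    queues = [deque(tasks[i::n_workers]) for i in range(n_workers)]
    results = []
    results_lock = threading.Lock()

    def next_task(me):
        # Normal case: take from the right end of our own deque.
        try:
            return queues[me].pop()
        except IndexError:
            pass
        # Out of work: try to steal from the left end of another deque.
        for other in range(n_workers):
            if other != me:
                try:
                    return queues[other].popleft()
                except IndexError:
                    continue
        raise LookupError("no work left anywhere")

    def worker(me):
        while True:
            try:
                task = next_task(me)
            except LookupError:
                return  # no tasks are ever added later, so we are done
            out = handle(task)
            with results_lock:
                results.append(out)

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


squares = run_work_stealing(list(range(20)), 4, lambda x: x * x)
print(sorted(squares))
```

The only cross-worker traffic is the `popleft()` on someone else's deque, which happens only when a worker has nothing of its own left.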

On Thu, 19 Aug 2021, 05:36 Stephen J. Turnbull, <turnbull.stephen.fw@u.tsukuba.ac.jp> wrote:
Christopher Barker writes:

 > The worker pool approach is probably the way to go, but there is a fair bit
 > of overhead to creating a multiprocessing job. So fewer, larger jobs are
 > faster than many small jobs.

True, but processing those rows would have to be awfully fast for the
increase in overhead from 16 chunks x 10^6 rows/chunk to 64 chunks x
250,000 rows/chunk to matter. That would be plenty granular to give a
good approximation to his nominal goal of 2 chunks per fast core to
1 chunk per slow core, using a single queue with multiple workers.
(Of course, it will almost certainly do a lot better, since 2 : 1 was
itself a very rough approximation, whereas the single-queue approach
adjusts to speed differences automatically.)

And if it's that fast, he could do it on a single core and still be
done by the time he's finished savoring a sip of coffee. ;-)

Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/TCC7ZZLP7YMOCWSKIC2KXQQVBKT3UIMZ/