So,
It is out of scope of Python's multiprocessing, and, as I perceive it, of
the stdlib as a whole, to allocate specific cores for each subprocess -
that is done automatically by the O.S. (and of course, since the O.S. has an
interface for it, one could write a specific Python library to allow this
granularity, and it could even check core capabilities).

As it stands, however, you simply have to change your approach:
instead of dividing your workload across the cores before starting, the
common approach is to set up worker processes, one per core (or
per processor thread), and use those as a pool of resources to which
you submit your work in chunks.
That way, if a worker happens to be on a faster core, it will be
done with its chunk earlier and accept more work before the
slower cores are available.

If you use "concurrent.futures" or a similar approach, this pattern emerges
naturally, with no specific fiddling needed on your part.
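A minimal sketch of this pattern, assuming a toy `work` function standing in for the real per-row computation: the data is cut into more chunks than there are workers, so workers that finish early (on faster cores) automatically pick up more chunks.

```python
from concurrent.futures import ProcessPoolExecutor

def work(chunk):
    # Stand-in for the real per-row computation (hypothetical).
    return sum(x * x for x in chunk)

def split(data, n_chunks):
    # Cut the data into many small chunks - several per worker -
    # so workers on faster cores naturally pick up more of them.
    size = max(1, len(data) // n_chunks)
    return [data[i:i + size] for i in range(0, len(data), size)]

if __name__ == "__main__":
    rows = list(range(100_000))          # stand-in for the DataFrame rows
    chunks = split(rows, n_chunks=64)    # many more chunks than cores
    with ProcessPoolExecutor() as pool:  # one worker per CPU by default
        results = list(pool.map(work, chunks))
    print(sum(results))
```

The key design point is oversubscribing on chunks, not on workers: the pool size stays at one process per CPU, while the chunk count is a multiple of that, so load-balancing across heterogeneous cores happens for free.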

On Wed, 18 Aug 2021 at 09:19, <c.buhtz@posteo.jp> wrote:
Hello,

before posting to python-dev I thought it best to discuss this
here. And I assume that someone else has had the same idea before me.
Maybe you can point me to the relevant discussion/ticket.

I read about Intel's hybrid CPUs. They combine different kinds of cores,
e.g. 8 high-speed cores and 8 low-speed (but more energy-efficient) cores,
in one CPU.

In my use cases I parallelize with Python's multiprocessing package to
work on millions of rows of pandas.DataFrame objects. These are tasks
that are not vectorizable. I simply cut the DataFrame horizontally into
pieces (one per available core).

But when the cores differ in their "speed", I need to know that.
E.g. with a 16-core CPU where half of the cores are slow and every core
has 1 million rows to work on, the 8 high-speed cores finish
earlier and just wait until the slow cores are finished. It would be
more efficient if the 8 high-speed cores each worked on 1.3 million
rows and the low-speed cores each on 0.7 million rows. It is not perfect,
but better. I know that they will not all finish at the same point in time,
but their end times will be closer together.

But to do this I need to know the type of the cores.

Am I wrong?

Are there any plans in Python development to take this into account?

Kind regards
Christian
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-leave@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/C3BYESZBZT2PNQSWCW3HGD25AGABJGOJ/
Code of Conduct: http://python.org/psf/codeofconduct/