More CPUs doesn't equal more speed

Chris Angelico rosuav at gmail.com
Sun May 26 14:11:22 EDT 2019


On Mon, May 27, 2019 at 4:06 AM Grant Edwards <grant.b.edwards at gmail.com> wrote:
>
> On 2019-05-23, Chris Angelico <rosuav at gmail.com> wrote:
> > On Fri, May 24, 2019 at 5:37 AM Bob van der Poel <bob at mellowood.ca> wrote:
> >>
> >> I've got a short script that loops through a number of files and
> >> processes them one at a time. I had a bit of time today and figured
> >> I'd rewrite the script to process the files 4 at a time by using 4
> >> different instances of python. My basic loop is:
> >>
> >> for i in range(0, len(filelist), CPU_COUNT):
> >>     for z in range(i, i+CPU_COUNT):
> >>         doit( filelist[z])
> >>
> >> With the function doit() calling up the program to do the
> >> lifting. Setting CPU_COUNT to 1 or 5 (I have 6 cores) makes no
> >> difference in total speed.  I'm processing about 1200 files and my
> >> total duration is around 2 minutes.  No matter how many cores I use
> >> the total is within a 5 second range.
> >
> > Where's the part of the code that actually runs them across multiple
> > CPUs? Also, are you spending your time waiting on the disk, the CPU,
> > IPC, or something else?
>
> He said he's using N different Python instances, and he even provided
> the code that runs in each instance which is obviously processing
> 1/Nth of the files.
>
> It's a pretty good bet that I/O is the limiting factor.
>

Sometimes, the "simple" and "obvious" code, the part that clearly has
no bugs in it, is the part that has the problem. :)
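
The quoted loop calls doit() once per file and waits for each call to
return, so as written only one job runs at a time whatever CPU_COUNT
is. A minimal sketch of one way to keep several jobs in flight,
assuming doit() blocks on an external program: the command run here is
a hypothetical stand-in (the real one isn't shown in the thread), and
process_all() is an illustrative name, not part of the original
script.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

CPU_COUNT = 4  # how many external programs to keep running at once


def doit(filename):
    """Hypothetical stand-in for the original doit(): one external program per file."""
    # Assumption: the real command is unknown, so a child Python process
    # simply echoes the file name in upper case.
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1].upper())", filename],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def process_all(filelist, workers=CPU_COUNT):
    """Run doit() on every file, with up to `workers` children alive at once."""
    # Threads suffice here: each thread only blocks waiting for its child
    # process, and the children do the real work on separate cores.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order and never indexes past the
        # end of filelist, unlike range(i, i + CPU_COUNT).
        return list(pool.map(doit, filelist))


if __name__ == "__main__":
    print(process_all(["a.txt", "b.txt", "c.txt"]))
```

If the disk really is the limiting factor, this may change very little;
timing a run with workers=1 against workers=4 would show which it is.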

ChrisA

