More CPUs don't equal more speed
Christian Gollwitzer
auriocus at gmx.de
Fri May 24 03:02:33 EDT 2019
Am 23.05.19 um 23:44 schrieb Paul Rubin:
> Bob van der Poel <bob at mellowood.ca> writes:
>> for i in range(0, len(filelist), CPU_COUNT):
>>     for z in range(i, i+CPU_COUNT):
>>         doit(filelist[z])
>
> Write your program to just process one file, then use GNU Parallel
> to run the program on your 1200 files, 6 at a time.
>
This is a very sensible suggestion. Running GNU parallel on a list of
files is relatively easy; for instance, I use it to resize many images
in parallel like this:

parallel convert {} -resize 1600 small_{} ::: *.JPG

The {} is replaced by each file name in turn.
Another option with an external tool is a Makefile. GNU make can run
jobs in parallel via the "-j" flag, so "make -j6" runs 6 processes in
parallel. It is more work to set up the Makefile, but it can pay off
if you have a dependency graph or if the process is interrupted:
"make" can figure out which files still need to be processed and
therefore continue a stopped job.
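A minimal sketch of such a Makefile for the resize example above
(assuming the same convert command, and that file names contain no
spaces):

```make
JPGS  := $(wildcard *.JPG)
SMALL := $(addprefix small_,$(JPGS))

all: $(SMALL)

# Pattern rule: each small_FOO.JPG depends on FOO.JPG and is only
# rebuilt when it is missing or older than its source.
small_%.JPG: %.JPG
	convert $< -resize 1600 $@
```

Run it with "make -j6"; after an interruption, simply run it again and
make skips the files that are already done.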
Maybe rewriting all of this from scratch in Python is not worth it.
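That said, if one did stay in Python, the usual shape is an executor
rather than the manual chunking in the quoted loop. A minimal sketch,
where doit (shelling out to convert) is only a stand-in for the real
per-file job:

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

def doit(path):
    # Hypothetical per-file job: shell out to ImageMagick, like the
    # "parallel convert ..." example above.
    subprocess.run(["convert", path, "-resize", "1600", "small_" + path],
                   check=True)

def run_parallel(func, filelist, workers=6):
    # Run up to `workers` tasks at once.  Threads are enough here
    # because each task spends its time waiting on an external process;
    # for pure-Python CPU-bound work you would use ProcessPoolExecutor
    # instead, because of the GIL.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(func, filelist))

# run_parallel(doit, filelist, workers=6)
```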
Christian