[Numpy-discussion] parallel compilation with numpy.distutils in numpy 1.10

Julian Taylor jtaylor.debian at googlemail.com
Fri Oct 10 14:11:22 EDT 2014


hi,
To speed up compilation of extensions I have made a PR to compile
extension files in parallel:
https://github.com/numpy/numpy/pull/5161

It adds the --jobs/-j flag to the build command of setup.py, which
sets the number of parallel compile processes.
E.g.
python setup.py build --jobs 4 install --prefix /tmp/local

Additionally it adds the environment variable NPY_NUM_BUILD_JOBS, which
is used if no command-line option is given. This helps e.g. with pip
installations, travis builds (which give you 1.5 cpus), or to put in your
.bashrc.
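
As a rough sketch of the precedence described above (this is not the code
from the PR; the helper name resolve_jobs and the fallback to 1 are
assumptions made for illustration), the job count could be resolved like
this:

```python
import os


def resolve_jobs(cli_jobs=None, environ=None):
    """Return the number of parallel compile jobs to use.

    Hypothetical helper: an explicit --jobs value wins, otherwise
    NPY_NUM_BUILD_JOBS is consulted, otherwise we assume a serial build.
    """
    environ = os.environ if environ is None else environ

    # A value given on the command line takes priority.
    if cli_jobs is not None:
        return max(1, int(cli_jobs))

    # Fall back to the environment variable if it is set and non-empty.
    env_value = environ.get("NPY_NUM_BUILD_JOBS")
    if env_value:
        try:
            return max(1, int(env_value))
        except ValueError:
            # Assumed behaviour: ignore a malformed value and build serially.
            return 1

    return 1
```

With this, resolve_jobs(4) returns 4 regardless of the environment, while
resolve_jobs() with NPY_NUM_BUILD_JOBS=3 exported returns 3.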

The parallelization is only across the files of a single extension, so it
is not super efficient, but an uncached numpy build goes down from 1m40s to
1m00s with 3 cores on my machine, which is quite decent.
Building scipy from scratch went down from 10 minutes to 6m30s on my machine.
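
To make "parallel within one extension" concrete, here is a minimal sketch
of the idea. It is an illustration only, not the implementation in the PR:
compile_one is a stand-in for the real compiler invocation, and the use of
a thread pool is an assumption (threads are enough when each task just
waits on an external compiler process).

```python
from concurrent.futures import ThreadPoolExecutor


def compile_one(source):
    """Placeholder for compiling one source file; returns the object name."""
    # A real build would spawn the compiler here; we only derive the name.
    return source.rsplit(".", 1)[0] + ".o"


def compile_extension(sources, jobs=1, compile_func=compile_one):
    """Compile all source files of a single extension, `jobs` at a time."""
    if jobs <= 1:
        # Serial build: compile in the order the sources were given.
        return [compile_func(src) for src in sources]

    # Parallel build: files of this one extension are compiled concurrently.
    # pool.map returns results in input order even if tasks finish out of order.
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(compile_func, sources))
```

Because each extension is still processed one after another, an extension
with a single source file gains nothing from a higher job count.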

Unfortunately projects using cython will not profit, as cython tends to
build an extension from one single file. (You may want to look into gcc's
internal parallelization for that, -flto=jobs.)

Does someone see issues with the interface I have currently set? Please
speak up soon.

There is still one problem with regard to parallelizing fortran 90.
ccompiler.py contains the following comment:
    # build any sources in same order as they were originally specified
    #   especially important for fortran .f90 files using modules

This indicates that f90 builds cannot be trivially parallelized. I do not
know much fortran; can someone explain to me when the ordering of
single-file compiles is an issue in f90?
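
To illustrate the kind of constraint that comment hints at: compiling a
file that defines a module produces a .mod file, and a file that "use"s
that module needs the .mod to exist when it is compiled. The sketch below
is not part of the PR; compile_order and the two regular expressions are
hypothetical and only handle simple "module"/"use" statements. It derives
a safe compile order from the sources.

```python
import re
from graphlib import TopologicalSorter  # Python 3.9+

# "module foo" at the start of a line, but not "module procedure ...".
MODULE_RE = re.compile(
    r"^\s*module\s+(?!procedure\b)(\w+)", re.IGNORECASE | re.MULTILINE
)
# "use foo", "use :: foo" or "use, intrinsic :: foo" at the start of a line.
USE_RE = re.compile(
    r"^\s*use\b\s*(?:,\s*\w+\s*)?(?:::\s*)?(\w+)", re.IGNORECASE | re.MULTILINE
)


def compile_order(sources):
    """Order files so that module providers come before their users.

    sources: dict mapping file name -> Fortran source text.
    """
    # Map each module name to the file that defines it.
    provider = {}
    for name, text in sources.items():
        for mod in MODULE_RE.findall(text):
            provider[mod.lower()] = name

    # A file depends on every other file that provides a module it uses.
    graph = {}
    for name, text in sources.items():
        deps = set()
        for mod in USE_RE.findall(text):
            owner = provider.get(mod.lower())
            if owner is not None and owner != name:
                deps.add(owner)
        graph[name] = deps

    # static_order() yields dependencies before the files that need them.
    return list(TopologicalSorter(graph).static_order())
```

Files with no dependency between them could in principle be compiled in
parallel; only the provider-before-user edges have to be respected.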

Cheers,
Julian Taylor