[Python-ideas] solving multi-core Python

Fri Jun 26 17:35:51 CEST 2015

On Thu, 25 Jun 2015 at 02:57 Eric Snow <ericsnowcurrently at gmail.com> wrote:

> On Wed, Jun 24, 2015 at 10:28 AM, Sturla Molden <sturla.molden at gmail.com>
> wrote:
> > The reality is that Python is used on even the largest supercomputers.
> The
> > scalability problem that is seen on those systems is not the GIL, but the
> > module import. If we have 1000 CPython processes importing modules like
> > NumPy simultaneously, they will do a "denial of service attack" on the
> file
> > system. This happens when the module importer generates a huge number of
> > failed open() calls while trying to locate the module files.
> >
> > There is even described in a paper on how to avoid this on an IBM Blue
> > Brain: "As an example, on Blue Gene P just starting up Python and
> importing
> > NumPy and GPAW with 32768 MPI tasks can take 45 minutes!"
>
> I'm curious what difference there is under Python 3.4 (or even 3.3).
> Along with being almost entirely pure Python, the import system now
> has some optimizations that help mitigate filesystem access
> (particularly stats).
>

>From the HPC setup that I use there does appear to be some difference.
The number of syscalls required to import numpy is significantly lower with
3.3 than 2.7 in our setup (I don't have 3.4 in there and I didn't compile
either of these myself):

$ strace python3.3 -c "import numpy" 2>&1 | egrep -c '(open|stat)'
1315
$ strace python2.7 -c "import numpy" 2>&1 | egrep -c '(open|stat)'
4444

It doesn't make any perceptible difference when running "time python -c
'import numpy'" on the login node. I'm not going to request 1000 cores in
order to test the difference properly. Also note that profiling in these
setups is often complicated by the other concurrent users of the system.

--
Oscar
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20150626/740a9d1c/attachment.html>