[Python-ideas] solving multi-core Python

Thu Jun 25 10:58:25 CEST 2015

On Wed, Jun 24, 2015 at 9:59 PM, Trent Nelson <trent at snakebite.org> wrote:
> (Sturla commented on the "import-DDoS" that you can run into on POSIX
>  systems, which is a good example.  You're saturating your underlying
>  hardware, sure, but you're not doing useful work -- it's important
>  to distinguish the two.)

To be clear, AFAIU the "import-DDoS" that supercomputers classically
run into has nothing to do with POSIX; it has to do with running on systems
that were designed for simulation workloads that go like: generate a
bunch of data from scratch in memory, crunch on it for a while, and
then spit out some summaries. So you end up with $1e11 spent on
increasing the FLOP count, and the absolute minimum spent on the
storage system -- basically just enough to let you load a single
static binary into memory at the start of your computation, and there
might even be some specific hacks in the linker to minimize the cost of
distributing that single binary load. (These are really weird
architectures; they usually do not even have shared library support.)
And the result is that when you try spinning up a Python program
instead, the startup sequence produces (number of imports) * (number
of entries in sys.path) * (hundreds of thousands of nodes)
simultaneous stat calls hammering some poor NFS server somewhere and
it falls over and dies. (I think often the network connection to the
NFS server is not even using the ridiculously-fast interconnect mesh,
but rather some plain-old-ethernet that gets saturated.) I could be
wrong, I don't actually work with these systems myself, but that's
what I've picked up.
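
To put rough numbers on that product, here is a back-of-the-envelope
sketch. Every figure in it is made up for illustration (I don't have
real numbers for any particular machine), and the function name is
just something I invented for this example. It also only counts one
lookup per (import, sys.path entry) pair, so treat it as a lower bound.

```python
def stat_storm(n_imports, n_path_entries, n_nodes):
    """Rough lower bound on simultaneous file lookups at startup.

    Assumes one lookup per (import, sys.path entry) pair on every node,
    with all nodes starting at the same moment.
    """
    return n_imports * n_path_entries * n_nodes


# Hypothetical job: 100 imports, 10 sys.path entries, 100,000 nodes.
total = stat_storm(n_imports=100, n_path_entries=10, n_nodes=100_000)
print(f"{total:,} lookups hitting the file server")
# prints "100,000,000 lookups hitting the file server"
```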

Continuing my vague and uninformed impressions, I suspect that this
would actually be relatively easy to fix by hooking the import system
to do something more intelligent, like nominate one node as the leader
and have it do the file lookups and then tell everyone else what it
found (via the existing message-passing systems). Though there is an
interesting problem of how you bootstrap the hook code.
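
For concreteness, here is a minimal sketch of what such a hook might
look like, using the Python 3 importlib machinery. The names
leader_resolve and BroadcastFinder are invented for this example, and
the broadcast step is only simulated: on a real machine the table
would presumably travel over the existing message-passing layer (e.g.
something like mpi4py's comm.bcast), which I am assuming rather than
showing. It also assumes modules live in ordinary files, not zips,
and it does nothing about the bootstrap problem.

```python
import importlib.abc
import importlib.util
import sys


def leader_resolve(names):
    """Run on the leader node only: do the filesystem lookups once."""
    table = {}
    for name in names:
        spec = importlib.util.find_spec(name)
        # Record only file-backed modules; built-in and frozen modules
        # never touch the shared filesystem in the first place.
        if spec is not None and spec.has_location:
            table[name] = spec.origin
    return table


class BroadcastFinder(importlib.abc.MetaPathFinder):
    """Answer import lookups from a table the leader already computed."""

    def __init__(self, table):
        self.table = table

    def find_spec(self, fullname, path=None, target=None):
        origin = self.table.get(fullname)
        if origin is None:
            return None  # unknown module: fall through to the normal finders
        # Build the spec straight from the known path, with no directory scan.
        return importlib.util.spec_from_file_location(fullname, origin)


# Leader side (hypothetical module list; "sys" is built-in, so it is skipped).
table = leader_resolve(["colorsys", "sys"])

# ... here the table would be broadcast to all other nodes (assumption) ...

# Follower side: install the finder ahead of the default path-based finder.
sys.meta_path.insert(0, BroadcastFinder(table))
import colorsys  # resolved via the table rather than by walking sys.path
```

Note that this only removes the lookups; every node still opens and
reads the module file itself, so a fuller version would probably want
to broadcast the file contents as well.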

But as to whether the new import hook stuff actually helps with
this... I'm pretty sure most HPC centers haven't noticed that Python 3
exists yet. See above re: extremely weird architectures -- many of us
are familiar with "clinging to RHEL 5" levels of conservatism, but
that's nothing on "look there's only one person who ever knew how to
get a working python and numpy using our bespoke compiler toolchain on
this architecture that doesn't support extension module loading (!!),
and they haven't touched it in years either"...

There are lots of smart people working on this stuff right now. But
they are starting from a pretty different place from those of us in
the consumer computing world :-).

-n

-- 
Nathaniel J. Smith -- http://vorpus.org