[Numpy-discussion] Improving Python+MPI import performance

Langton, Asher langton2 at llnl.gov
Fri Jan 13 16:20:23 EST 2012


On 1/13/12 12:38 PM, Sturla Molden wrote:
>On 13.01.2012 21:21, Dag Sverre Seljebotn wrote:
>> Another idea: Given your diagnostics, wouldn't dumping the output of
>> "find" of every path in sys.path to a single text file work well?
>
>It probably would, and would also be less prone to synchronization
>problems than using an MPI broadcast. Another possibility would be to
>use a bsddb (or sqlite?) file as a persistent dict for caching the
>output of imp.find_module.
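
As a rough illustration of the persistent-dict idea, something along
these lines could cache imp.find_module results (shelve is used here
purely as a stand-in for bsddb or sqlite, the cache file name is made
up, and only top-level modules are handled):

    # Sketch only: cache imp.find_module results in a persistent dict so
    # that repeated lookups skip the sys.path scan. The open file handle
    # returned by imp.find_module is not cached (it can't be pickled), so
    # callers reopen the file themselves if they need it.
    import imp
    import shelve

    CACHE_FILE = 'module_locations.db'  # made-up name for the cache file

    def find_module_location(name, cache):
        """Return (pathname, description) for a top-level module."""
        if name in cache:
            return cache[name]
        fileobj, pathname, description = imp.find_module(name)
        if fileobj is not None:
            fileobj.close()
        cache[name] = (pathname, description)
        return pathname, description

    if __name__ == '__main__':
        cache = shelve.open(CACHE_FILE)
        try:
            print(find_module_location('os', cache))
        finally:
            cache.close()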

We tested something along those lines. Tim Kadich, a summer student at
LLNL, wrote a module that walked sys.path and built up a dict of
module->location mappings for a subset of module types (a rough sketch of
that kind of scan appears after the list below). My recollection is that
it worked well, and as you note, it didn't have the synchronization
issues that MPI_Import has. We didn't fully implement it, because
handling complicated packages correctly looked like it would require
either re-implementing much of Python's internal import code or modifying
the interpreter itself. I don't think that MPI_Import is ultimately the
"right" solution, but it shows how easily we can reap significant gains.
Two better approaches that come to mind are:

1) Fixing this bottleneck at the interpreter level (pre-computing and
caching the locations)

2) More generally, dealing with this and other library-loading issues at
the system level, perhaps by putting a small disk near each node or small
collection of nodes, along with a command to push (broadcast) selected
portions of the filesystem to these (more-)local disks. Basically, the
idea would be to let the user specify the directories or files that most
of the processes will access and that can be treated as read-only, so
that those objects can be cached near the node.
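
As a concrete sketch of the path scan behind the first approach (this is
not the module Tim wrote, just a stripped-down stand-in): the following
walks sys.path once and records where each top-level module or package
lives. It handles only plain .py files and packages with an __init__.py,
and ignores C extensions, zip imports, and the other cases that made a
complete solution hard:

    # Sketch of a one-time sys.path scan building a module->location map,
    # so later imports can consult the dict instead of stat-ing every
    # directory on the path. Only .py modules and packages are handled.
    import os
    import sys

    def build_location_map(paths=None):
        locations = {}
        for directory in (paths if paths is not None else sys.path):
            if not os.path.isdir(directory):
                continue
            for entry in os.listdir(directory):
                full = os.path.join(directory, entry)
                if entry.endswith('.py') and os.path.isfile(full):
                    name = entry[:-3]
                elif os.path.isfile(os.path.join(full, '__init__.py')):
                    name = entry  # a package directory
                else:
                    continue
                # The first hit on sys.path wins, matching import order.
                locations.setdefault(name, full)
        return locations

    if __name__ == '__main__':
        print(build_location_map().get('os'))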

-Asher



