[Numpy-discussion] Improving Python+MPI import performance
travis at continuum.io
Fri Jan 13 16:48:51 EST 2012
It is straightforward to implement a "registry mechanism" for Python that bypasses imp.find_module (e.g. by using sys.meta_path). You could create a registry file for a package or distribution (much like Dag described) and push it to every node during distribution.
The registry file would hold the mapping
package_name : file_location
which would avoid all the failed open calls. You would need to keep the registry updated as Dag describes, but this seems like a fairly simple approach that should help.
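As a rough illustration of the registry idea, here is a minimal sketch of a finder placed on sys.meta_path. It is not code from this thread. It assumes a hypothetical JSON registry file mapping module names to file locations, and it uses the newer importlib find_spec protocol rather than imp.find_module (the imp module was removed in Python 3.12). The names RegistryFinder and install_registry are made up for this example.

```python
import importlib.abc
import importlib.util
import json
import sys


class RegistryFinder(importlib.abc.MetaPathFinder):
    """Resolve modules from a prebuilt {module_name: file_location} map.

    Hypothetical helper: the registry format (a flat JSON object) is an
    assumption made for this sketch, not something defined in the thread.
    """

    def __init__(self, registry):
        self.registry = dict(registry)

    def find_spec(self, fullname, path=None, target=None):
        location = self.registry.get(fullname)
        if location is None:
            # Not registered: return None so the normal finders still run.
            return None
        # Build the spec directly from the known location; no directory
        # scan over sys.path happens for registered names. If the location
        # is a package's __init__.py, the spec is marked as a package.
        return importlib.util.spec_from_file_location(fullname, location)


def install_registry(registry_path):
    """Load the registry file and put the finder first on sys.meta_path."""
    with open(registry_path) as fh:
        registry = json.load(fh)
    finder = RegistryFinder(registry)
    sys.meta_path.insert(0, finder)
    return finder
```

Each node would call install_registry() on its local copy of the registry file before importing anything large. A name that is missing from the registry still falls through to the regular import machinery, so a stale registry costs speed rather than correctness for new modules; an entry pointing at a file that has since moved would fail at load time, which is why the registry has to be kept up to date.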
On Jan 13, 2012, at 2:38 PM, Sturla Molden wrote:
> On 13.01.2012 21:21, Dag Sverre Seljebotn wrote:
>> Another idea: Given your diagnostics, wouldn't dumping the output of
>> "find" of every path in sys.path to a single text file work well?
> It probably would, and would also be less prone to synchronization
> problems than using an MPI broadcast. Another possibility would be to
> use a bsddb (or sqlite?) file as a persistent dict for caching the
> output of imp.find_module.