[SciPy-Dev] [gpaw-users] wrapper for Scalapack
Ralf Gommers
ralf.gommers at gmail.com
Sun Oct 29 15:15:32 EDT 2017
On Mon, Oct 30, 2017 at 1:14 AM, Bennet Fauber <bennet at umich.edu> wrote:
> Ralf, and all,
>
> >> There seems to be a profusion of tools for parallelization, so choosing
> >> just one to use as a basis for scipy's parallelization could be really
> >> frustrating for users who have a reason to need a different one.
> >
> > You're thinking about the relatively small fraction of power users here
> > that would care (compared to the n_jobs=<number> trivial parallelization
> > users), and my first thought is that addressing that use case comes with
> > costs that are possibly not worth the effort.
>
> You might consider separating that which can be done on one physical
> machine from that which requires (or expects) many.
>
> This was largely done by the R developers. The 'snow' library used
> rsh/ssh, whereas the 'multicore' library used fork() and processes.
> Steve Weston and company have the 'foreach' library that provides a
> user interface to various backends that distribute the tasks
> appropriately. Only after many years of experience did they merge many
> of these functions into 'parallel', which became part of base R.
>
Thanks, always interesting to know the history of how designs evolved in
related libraries/languages.
> It would probably be good to try to coordinate efforts at
> parallelizing within SciPy, if you choose to go that route, with those
> who are trying to get this to work better at the program level, e.g.,
> multiprocessing and ipyparallel.
Multiprocessing is exactly what I was talking about (used directly or via
joblib, which is built on top of it). Ipyparallel is squarely aimed at use
from within Jupyter, so it is not very relevant in the context of a library.
Actually, the implementation isn't the interesting part, I think
(scipy.spatial.cKDTree uses threading rather than multiprocessing, for
example); having an easy and uniform API is what matters.
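To make that concrete, here is a minimal sketch of the kind of uniform API
I have in mind, using joblib directly. The `expensive_kernel` function and
the data below are just placeholders rather than a proposed SciPy
interface; only the joblib calls are real:

    from joblib import Parallel, delayed
    import numpy as np

    def expensive_kernel(x):
        # stand-in for the per-item work inside some SciPy routine
        return np.sum(np.sin(x) ** 2)

    data = [np.random.rand(10_000) for _ in range(8)]

    # The caller only picks n_jobs; whether the backend uses threads or
    # processes stays an implementation detail. n_jobs=1 is serial,
    # n_jobs=-1 uses all available cores, and the call looks the same.
    results = Parallel(n_jobs=-1)(
        delayed(expensive_kernel)(chunk) for chunk in data
    )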
> Whatever gets done, it would be good to have it work well with many of
> the ways that people are implementing parallel computing.
>
> As a cluster administrator and help desk person, I would also
> encourage you to think about how this would play out in a shared
> environment that is administered not by the scientist but by some
> system administrator who may have different ideas about what can and
> cannot be done with respect to intermachine communication and using
> multiple processes (for example, is ssh blocked? are user jobs put
> into cgroups to limit cores and memory?).
>
The one thing I can think of is the design of `n_jobs=-1`, which means "use
as many cores as are present". That could instead pick up the limit on
usable cores if it is exposed in a standard way for a given OS. This must
have come up for scikit-learn before, I'd think.
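As a rough sketch of what `n_jobs=-1` could resolve to in such a shared
environment, the helper below first respects the process's CPU affinity
mask (which batch schedulers typically set) and then a cgroup CPU quota if
one is readable. os.sched_getaffinity is Linux-only, and the /sys/fs/cgroup
paths assume a cgroup v1 layout, so treat this as an illustration of the
idea rather than a portable implementation:

    import os

    def effective_cpu_count():
        # Prefer the affinity mask over the raw core count when available.
        try:
            n = len(os.sched_getaffinity(0))
        except AttributeError:
            n = os.cpu_count() or 1

        # Honour a cgroup v1 CPU quota if one is in place (assumed layout).
        try:
            with open("/sys/fs/cgroup/cpu/cpu.cfs_quota_us") as f:
                quota = int(f.read())
            with open("/sys/fs/cgroup/cpu/cpu.cfs_period_us") as f:
                period = int(f.read())
            if quota > 0 and period > 0:
                n = min(n, max(1, quota // period))
        except (OSError, ValueError):
            pass  # no cgroup CPU controller, or a different layout

        return n

    n_jobs = effective_cpu_count()  # what n_jobs=-1 could map to here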
> Just a couple of thoughts from the sidelines; hopefully not too far off
> topic.
>
Not at all. Thanks.
Ralf