[SciPy-Dev] [gpaw-users] wrapper for Scalapack
marc
marc.barbry at mailoo.org
Mon Oct 30 04:25:11 EDT 2017
Dear all,
I'm glad to see that the SciPy community looks interested in this topic.
As I pointed out in the first mail of this thread, we have started an
implementation of a ScaLAPACK wrapper; you can find it at the following
repository:
https://gitlab.com/mbarbry/python-scalapack
We try to use the same method that has been used in SciPy for the
BLAS/LAPACK wrappers (using f2py).
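As a point of reference, this is the interface style the existing
f2py-generated wrappers give you in SciPy (a serial LAPACK call; the
ScaLAPACK analogue would look similar, but additionally needs a BLACS
process grid and distributed arrays):

    import numpy as np
    from scipy.linalg import lapack

    a = np.array([[3.0, 1.0],
                  [1.0, 2.0]])
    b = np.array([9.0, 8.0])
    # dgesv is the f2py-generated wrapper around LAPACK's DGESV solver
    lu, piv, x, info = lapack.dgesv(a, b)
    assert info == 0  # a nonzero info signals a LAPACK error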
The wrapper is already functional, but the installation is not as
straightforward as I would hope, and only a few routines are
implemented at the moment.
So any help is more than welcome.
Best regards,
Marc
On 10/29/2017 08:15 PM, Ralf Gommers wrote:
>
>
> On Mon, Oct 30, 2017 at 1:14 AM, Bennet Fauber <bennet at umich.edu> wrote:
>
> Ralf, and all,
>
> >> There seems to be a profusion of tools for parallelization, so
> >> choosing just one to use as a basis for scipy's parallelization
> >> could be really frustrating for users who have a reason to need a
> >> different one.
> >
> > You're thinking about the relatively small fraction of power users
> > here that would care (compared to the n_jobs=<number> trivial
> > parallelization users), and my first thought is that addressing that
> > use case comes with costs that are possibly not worth the effort.
>
> You might consider separating that which can be done on one physical
> machine from that which requires (or expects) many.
>
> This was largely done by the R developers. The 'snow' library used
> rsh/ssh, whereas the 'multicore' library used fork() and processes.
> Steve Weston and company have the 'foreach' library that provides a
> user interface to various backends that distribute the tasks
> appropriately. Only after many years of experience did they merge many
> functions into 'parallel', which became part of base R.
>
>
> Thanks, always interesting to know the history of how designs evolved
> in related libraries/languages.
>
>
> It would probably be good to try to coordinate efforts at
> parallelizing within SciPy, if you choose to go that route, with those
> who are trying to get this to work better at the program level, e.g.,
> multiprocessing and ipyparallel.
>
>
> Multiprocessing is exactly what I was talking about (used directly or
> via joblib, which is built on top of it). Ipyparallel is squarely
> aimed at use from within Jupyter, so not very relevant in the context
> of a library.
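>
> As a minimal sketch of that pattern (assuming joblib is installed; the
> function and inputs are just placeholders):
>
>     from joblib import Parallel, delayed
>     import math
>
>     # evaluate math.sqrt over the inputs with one worker per core
>     # (n_jobs=-1 means "use all available cores")
>     results = Parallel(n_jobs=-1)(delayed(math.sqrt)(i) for i in range(16))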
>
> Actually, the implementation isn't too interesting, I think
> (scipy.spatial.cKDTree uses threading rather than multiprocessing, for
> example); having an easy and uniform API is what matters.
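>
> cKDTree already exposes that style of API; at the time of writing the
> knob is called n_jobs, for example:
>
>     import numpy as np
>     from scipy.spatial import cKDTree
>
>     pts = np.random.rand(1000, 3)
>     tree = cKDTree(pts)
>     # nearest-neighbour query on all available cores; whether this is
>     # threading or multiprocessing is an implementation detail
>     dist, idx = tree.query(pts, k=2, n_jobs=-1)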
>
> Whatever gets done, it would be good
> to have it work well with many of the ways that people are
> implementing parallel computing.
>
> As a cluster administrator and help desk person, I would also
> encourage you to think about how this would play out in a shared
> environment that is administered not by the scientist but by some
> system administrator who may have different ideas about what can and
> cannot be done with respect to intermachine communication and using
> multiple processes (for example, is ssh blocked? are user jobs put
> into cgroups to limit cores and memory?).
>
>
> The one thing I can think of is the design of `n_jobs=-1`, which means
> "use as many cores as present". This could pick up the limit on cores
> if that is defined in a standard way for a given OS. This must have
> come up for scikit-learn before, I'd think.
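>
> A rough sketch of how picking up such a limit could work on Linux (an
> assumption: the affinity mask reflects a cpuset/cgroup restriction,
> while a pure CPU-time quota would not show up here):
>
>     import os
>     import multiprocessing
>
>     # all hardware cores on the machine
>     total = multiprocessing.cpu_count()
>     # cores this process may actually run on (Linux-only); a batch
>     # scheduler's cpuset/cgroup restriction is reflected here
>     available = len(os.sched_getaffinity(0))
>     n_jobs = available  # a safer resolution of n_jobs=-1 than total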
>
>
> Just a couple of thoughts from the sidelines; hopefully not too
> far off topic.
>
>
> Not at all. Thanks.
>
> Ralf
>