[IPython-dev] IPython parallel "education"

Mon Dec 22 12:19:27 EST 2014

Hi Aron,

On 18 December 2014 at 20:22, Aron Ahmadia <aron at ahmadia.net> wrote:

> What happens if instead of partitioning the data, you create a list of
> work units and map those?
> Something like:
>
> def apply_the_func(i):
>       return the_func(X[N*i:(i+1)*N])
>
> Y = run_func.map(apply_the_func, range(nodes))
>

This provides a substantial speed-up. I also tested other approaches
(scatter & gather), but all in all, "pushing" X to the engines and using
your suggestion seems to work. A question I have is what is going on behind
the scenes when I push X around: does every engine get a copy of the full
X? In my case, X can be quite large, and it seems expensive to send that
much data to engines that will each operate on only a small fraction of
it.
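
To make the cost concern concrete, here is a rough sketch in plain Python
(not IPython.parallel itself). It compares the pickled size of the full X
with the pickled size of a single slice, which approximates what would have
to travel to an engine in each case. The names the_func, make_chunks, and
the sizes used are assumptions for illustration only, and the built-in map
stands in for a parallel map over engines.

```python
import pickle


def the_func(chunk):
    # Hypothetical stand-in for the real per-chunk computation.
    return sum(chunk)


def make_chunks(X, nodes):
    # Split X into `nodes` contiguous slices of length N = len(X) // nodes;
    # the last slice also takes any remainder.
    N = len(X) // nodes
    chunks = []
    for i in range(nodes):
        end = (i + 1) * N if i < nodes - 1 else len(X)
        chunks.append(X[i * N:end])
    return chunks


X = list(range(100000))  # illustrative data, not the real X
nodes = 4                # illustrative engine count

chunks = make_chunks(X, nodes)

# Serialized size of the whole array versus each slice.
full_bytes = len(pickle.dumps(X))
chunk_bytes = [len(pickle.dumps(c)) for c in chunks]

# The built-in map stands in for a parallel map over the engines.
Y = list(map(the_func, chunks))

print("full X:", full_bytes, "bytes; largest slice:", max(chunk_bytes), "bytes")
```

If every engine receives the full X, the total payload grows with the
number of engines, whereas sending each engine only its own slice keeps the
total close to the size of X itself.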

Thanks for your help,
Jose