[IPython-dev] IPython parallel "education"

Jose Gomez-Dans jgomezdans at gmail.com
Tue Dec 23 10:04:27 EST 2014


Hi Moritz,

Many thanks for your comments, they're really helpful!

On 22 December 2014 at 20:50, Moritz Beber <moritz.beber at gmail.com> wrote:

> Hi Jose,
>
> Just wanted to share my experience with parallel:
>
>
> I've been working with up to 2 GB of data and using the push mechanism is
> not really feasible at that size. Also, the transmission time increases
> linearly (super linearly?) with more target engines. So I've tried a few
> solutions:
>
> 1.) If you're working on the same host and don't expect to expand that
> switch to multiprocessing. It's very fast in transmitting data.
>

For some work, I have used multiprocessing with shared memory, and this
worked very well.
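
For readers following the thread, the shared-memory approach mentioned above might look roughly like the sketch below. It is only an illustration under stated assumptions: the data is a float64 numpy array, all workers run on the same host, and the names (parallel_row_sums, row_sum, _init) are hypothetical rather than taken from anyone's actual code.

```python
import multiprocessing as mp
import numpy as np

# Hypothetical names throughout. The shared buffer is handed to each worker
# once, at pool start-up, instead of being pickled with every task.
_shared = None


def _init(raw, shape):
    """Pool initializer: wrap the shared buffer in a numpy view (no copy)."""
    global _shared
    _shared = np.frombuffer(raw, dtype=np.float64).reshape(shape)


def row_sum(i):
    """Work on one row of the shared array; only the index is transmitted."""
    return float(_shared[i].sum())


def parallel_row_sums(data, processes=2):
    """Copy `data` into shared memory once, then sum its rows in parallel."""
    raw = mp.RawArray('d', data.size)  # 'd' is a C double (float64)
    np.frombuffer(raw, dtype=np.float64)[:] = data.ravel()
    with mp.Pool(processes, initializer=_init,
                 initargs=(raw, data.shape)) as pool:
        return pool.map(row_sum, range(data.shape[0]))


if __name__ == "__main__":
    data = np.arange(12, dtype=np.float64).reshape(3, 4)
    print(parallel_row_sums(data))
```

The design choice here is that each task carries only a row index, so the cost of moving the array is paid once rather than per task or per worker.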


> 2.) Store your data on the file system and have each engine access that.
> Either you have a shared file system for the remote kernels to access or
> you'll need to copy the data beforehand/use paramiko.
>

There's a significant overhead in using NFS on our system (access to the
disk server, which is heavily used, is our bottleneck). This will probably
limit things. I was hoping there was some way to transmit only the chunks
that each engine actually needs. Using a file store requires copying the
entire file to all slaves, which may be a problem.
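
One way to send each engine only its own chunk, sketched under assumptions: `client` is an already connected IPython.parallel Client, the array is split evenly by rows, and the helper names (plan_chunks, push_chunks) and the variable name "chunk" are hypothetical. Indexing a Client with an engine id gives a view on that single engine, and its push() call sends a dict of names to values.

```python
import numpy as np


def plan_chunks(n_rows, n_engines):
    """Return one (start, stop) row range per engine, covering all rows."""
    edges = np.linspace(0, n_rows, n_engines + 1).astype(int)
    return [(int(a), int(b)) for a, b in zip(edges[:-1], edges[1:])]


def push_chunks(client, data, name="chunk"):
    """Send each engine only its own rows of `data`.

    Assumes `client` is a connected IPython.parallel Client with at least
    one engine; nothing here starts or configures a cluster.
    """
    ranges = plan_chunks(len(data), len(client.ids))
    for engine_id, (start, stop) in zip(client.ids, ranges):
        # client[engine_id] is a view on a single engine, so only this
        # slice travels to it, not the whole array.
        client[engine_id].push({name: data[start:stop]}, block=True)
    return ranges
```

For a plain even split, the scatter() method of a DirectView (for example client[:].scatter('chunk', data)) does much the same in one call; the explicit loop is mainly useful when different engines need differently sized or overlapping pieces.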

> 3.) Having a database server is quite a bit of work to invest at the
> beginning (especially if you don't know how) but really lends itself to
> this sort of task. A database server usually has a connection pool so that
> it can automatically handle many workers accessing it concurrently.
>

I'm not familiar with databases at all, and all our stuff is stored as
arrays. I was thinking about using HDF5 (well, PyTables), given that it's
compressed and a fairly transparent numpy array "transport", but I was
hoping that I wouldn't have to use NFS!
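
As an illustration of reading only part of an array from a file, here is a minimal sketch. It deliberately swaps in numpy's own .npy format with memory mapping as a stand-in for HDF5/PyTables, so it shows the partial-read idea rather than the PyTables API, and it has no compression. The function name read_rows and the file name are hypothetical.

```python
import os
import tempfile

import numpy as np


def read_rows(path, start, stop):
    """Read only rows [start, stop) of a .npy file."""
    mapped = np.load(path, mmap_mode="r")  # maps the file, loads no data yet
    return np.array(mapped[start:stop])    # copies just the requested rows


if __name__ == "__main__":
    # Write a small example array to a temporary file, then read a slice.
    path = os.path.join(tempfile.mkdtemp(), "data.npy")
    np.save(path, np.arange(20).reshape(5, 4))
    print(read_rows(path, 1, 3))
```

Each engine would call read_rows with its own range, so only that part of the file is read over the file system, although the file still has to be visible to every engine.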

Erik's email is waaaay over my competence level on these things!!!

I'll keep experimenting and trying things, and hopefully report back (or,
more likely, ask more questions).
Thanks!
Jose