[IPython-dev] ZMQ Parallel IPython Performance preview

Brian Granger ellisonbg at gmail.com
Fri Oct 22 12:26:20 EDT 2010


This is absolutely incredible.  We were hoping it would be better, but this
is even better than I had imagined (partially because the Twisted
performance is even worse than I thought).  In particular the stability of
all the zmq based queues is really impressive.  In my mind, this means that
to first order, we really are hitting the latency/throughput of the loopback
interface and that this limitation is stable under load.  This is really
significant, because it means the performance in a real cluster could be
improved by using a fast interconnect.  Also, this brings our task
granularity down to ~ 1 ms, which is a big deal.  This *really* sets us
apart from the traditional load balancing that a batch system does.  Can you
rerun this test keeping the CPU effort at 0, but sending a large buffer with
each task?  Then vary the size of that buffer (512, 1024, ...).  I want to
see how zmq scales in the throughput sense.  Twisted is especially horrible
in this respect.
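
A rough sketch of the kind of sweep I mean, under some loud assumptions:
submit(func, arg) is a hypothetical callable that returns an object with a
.result() method, echo_size/flood/sweep are illustrative names only, and a
local thread pool stands in for the real client.  The numbers it prints
therefore say nothing about zmq or Twisted; the point is only the shape of
the measurement (the same "sent" and "roundtrip" times, one row per buffer
size).

```python
import time
from concurrent.futures import ThreadPoolExecutor


def echo_size(buf):
    """Zero-CPU task: only reports how many bytes arrived."""
    return len(buf)


def flood(submit, n_tasks, payload):
    """Submit n_tasks copies of payload; return (sent, roundtrip) in seconds.

    sent:      first submit until the last submit returns
    roundtrip: first submit until the last result is in hand
    """
    start = time.perf_counter()
    futures = [submit(echo_size, payload) for _ in range(n_tasks)]
    sent = time.perf_counter() - start
    results = [f.result() for f in futures]
    roundtrip = time.perf_counter() - start
    # Sanity check: every task saw the whole buffer.
    assert all(r == len(payload) for r in results)
    return sent, roundtrip


def sweep(submit, sizes, n_tasks=512):
    """Run flood() once per buffer size; return rows of (size, sent, roundtrip)."""
    rows = []
    for size in sizes:
        sent, roundtrip = flood(submit, n_tasks, b"x" * size)
        rows.append((size, sent, roundtrip))
    return rows


if __name__ == "__main__":
    sizes = [512 * 2 ** i for i in range(6)]  # 512, 1024, ..., 16384 bytes
    # Stand-in backend: a local thread pool, NOT the ZMQ or Twisted client.
    with ThreadPoolExecutor(max_workers=4) as pool:
        for size, sent, roundtrip in sweep(pool.submit, sizes):
            print("%8d bytes  sent %.4f s  roundtrip %.4f s" % (size, sent, roundtrip))
```

To point this at a real cluster, only the submit callable would need to
change; flood() and sweep() do not care what is behind it.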

I am also quite impressed at how little we lose in moving to Python for the
actual scheduling.  That is still very good performance, especially taking
into account the dependency handling that is going on.  I have some ideas
about performance testing that will make a very good story for a paper.

This is really great and we definitely need to start reviewing this soon.



On Thu, Oct 21, 2010 at 11:53 PM, MinRK <benjaminrk at gmail.com> wrote:

> I have my first performance numbers for throughput with the new parallel
> code riding on ZeroMQ, and results are fairly promising.  Roundtrip time for
> ~512 tiny tasks submitted as fast as they can is ~100x faster than with
> Twisted.
> As a throughput test, I submitted a flood of many very small tasks that
> should take ~no time:
> new-style:
> def wait(t=0):
>     import time
>     time.sleep(t)
> submit:
> client.apply(wait, args=(t,))
> Twisted:
> task = StringTask("import time; time.sleep(%f)"%t)
> submit:
> client.run(task)
> Flooding the queue with these tasks with t=0, and then waiting for the
> results, I tracked two times:
> Sent: the time from the first submit until the last submit returns
> Roundtrip: the time from the first submit to getting the last result
> Plotting these times vs number of messages, we see some decent numbers:
> * The pure ZMQ scheduler is fastest, 10-100 times faster than Twisted
> roundtrip
> * The Python scheduler is ~3x slower roundtrip than pure ZMQ, but no
> penalty to the submission rate
> * Twisted performance falls off very quickly as the number of tasks grows
> * ZMQ performance is quite flat
> Legend:
> zmq: the pure ZMQ Device is used for routing tasks
> lru/weighted: the simplest/most complicated routing schemes respectively in
> the Python ZMQ Scheduler (which supports dependencies)
> twisted: the old IPython.kernel
> [image: roundtrip.png]
> [image: sent.png]
> Test system:
> Core-i7 930, 4x2 cores (ht), 4-engine cluster all over tcp/loopback, Ubuntu
> 10.04, Python 2.6.5
> -MinRK
> http://github.com/minrk

Brian E. Granger, Ph.D.
Assistant Professor of Physics
Cal Poly State University, San Luis Obispo
bgranger at calpoly.edu
ellisonbg at gmail.com
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sent.png
Type: image/png
Size: 31114 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/ipython-dev/attachments/20101022/b4c4fa9e/attachment.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: roundtrip.png
Type: image/png
Size: 30731 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/ipython-dev/attachments/20101022/b4c4fa9e/attachment-0001.png>
