[IPython-dev] Qt/Curses interfaces future: results of the weekend mini-sprint (or having fun with 0mq)

Mikhail Terekhov termim at gmail.com
Thu Mar 25 14:16:12 EDT 2010


Brian,

> We have considered everything :).  The story of how we have arrived at
> 0MQ is pretty interesting and worth recording.  We have had
> implementations based on XML-RPC, Twisted (numerous protocols, HTTP,
> PB, Foolscap) and raw sockets. I have played with earlier versions of
> RPyC as well.
>
> There are a couple of issues we keep running into with *every* solution
> we have tried (except for 0MQ):
>
> * The GIL kills.  Because IPython is designed to execute arbitrary
> user code, and our users often run wrapped C/C++ libraries, it is not
> uncommon for non-GIL releasing code to be run in IPython.  When this
> happens, any Python thread *completely stops*.  When you are building
> a robust distributed system, you simply can't have this.  As far as I
> know all Python based networking and RPC libraries suffer from this
> same exact issue.  Note: it is not enough that the underlying socket
> send/recv happen with the GIL released.
>
That sounds intriguing! How is 0MQ different in this regard? Does it
maintain its own threads internally, independent of the GIL?
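
To make the GIL point concrete, here is a minimal sketch using only the
standard library. It is not IPython or pyzmq code, and the function name
max_heartbeat_gap is made up for this illustration. A helper thread records
a timestamp every few milliseconds while the main thread runs some work. The
assumption is a standard CPython build, where a single long C-level call
such as sum() over a large range is not interrupted by thread switches,
while time.sleep() releases the GIL.

```python
import threading
import time


def max_heartbeat_gap(work, interval=0.005):
    """Run work() in the calling thread while a helper thread ticks.

    Returns (largest gap between consecutive timestamps, duration of work),
    both in seconds. Hypothetical helper, for illustration only.
    """
    ticks = []
    stop = threading.Event()

    def heartbeat():
        # Record a timestamp, then sleep; this thread needs the GIL to run.
        while not stop.is_set():
            ticks.append(time.perf_counter())
            time.sleep(interval)

    helper = threading.Thread(target=heartbeat)
    start = time.perf_counter()
    helper.start()
    work()
    end = time.perf_counter()
    stop.set()
    helper.join()

    # Include the start and end of the work so there are always two stamps.
    stamps = sorted([start, end] + ticks)
    gap = max(later - earlier for earlier, later in zip(stamps, stamps[1:]))
    return gap, end - start


if __name__ == "__main__":
    # One long C-level call: assumed to hold the GIL for its whole duration.
    gap, took = max_heartbeat_gap(lambda: sum(range(20_000_000)))
    print("sum():   work took %.3fs, longest heartbeat gap %.3fs" % (took, gap))

    # time.sleep() releases the GIL, so the helper keeps ticking.
    gap, took = max_heartbeat_gap(lambda: time.sleep(0.5))
    print("sleep(): work took %.3fs, longest heartbeat gap %.3fs" % (took, gap))
```

If the assumption holds, the longest gap in the first case should be close
to the full duration of the work, and in the second case close to the
heartbeat interval. The exact numbers depend on the machine and the
interpreter build.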


> * Performance. We need network protocols that have near ping latencies
> but can also easily handle many MB - GB sized messages at the same
> time.  Prior to 0MQ I have not seen a network protocol that can do
> both.  Our experiments with 0MQ have been shocking.  We see near ping
> latencies for small messages and can send massive messages without
> even thinking about it.  All of this is while CPU and memory usage is
> minimal.

It sounds like you've found a silver bullet :)
BTW I use twisted for client/server communication in my projects these days
and while I never had a need to transfer GB sized messages back and forth,
I've never had any issues with latencies either, except for the delays
inherent to some particular network.

> One of the difficulties that networking libraries in Python
> face (at least currently) is that they all use strings for network
> buffers.  The problem with this is that you end up copying them all
> over the place.  With Twisted, we have to go to incredible lengths to
> avoid this.  Is the situation different with RPyC?
>
Yes, the string type is an old workhorse in Python. I don't know the
internals of RPyC, but I suspect it uses strings extensively as well. What
does pyzmq use instead of strings?
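
For reference, the copying problem mentioned in the quoted text can be shown
with a small standard-library sketch. It contrasts slicing a bytes object,
which allocates a new object, with slicing a memoryview, which shares the
underlying buffer. This is only an illustration of the general technique in
current Python. It makes no claim about what pyzmq does internally, and the
function name copy_versus_view is invented for the example.

```python
def copy_versus_view():
    """Show that a bytes slice is a copy while a memoryview slice is not."""
    # A mutable buffer standing in for a network message: 7-byte header + body.
    payload = bytearray(b"header:" + b"x" * 16)

    copied = bytes(payload)[7:]      # new object, independent of payload
    view = memoryview(payload)[7:]   # no copy, still backed by payload

    # Overwrite the first body byte in place (same length, so no resize).
    payload[7] = ord("A")

    # The copy still holds the old byte; the view sees the new one.
    return copied[:1], bytes(view[:1])


if __name__ == "__main__":
    old, new = copy_versus_view()
    print("copy sees:", old)
    print("view sees:", new)
```

Passing such views around instead of slices is one way to avoid duplicating
large messages at every layer.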


> * Messaging not RPC.  As we have developed a distributed architecture
> that is more and more complex, we have realized something quite
> significant: we are not really doing RPC, we are sending messages in
> various patterns and 0MQ encodes these patterns extremely well.
> Examples are request/reply and pub/sub, but other more complex
> messaging patterns are possible as well - and we need those. In my
> mind, the key difference from RPC is the presence of message queues
> in an architecture.  Multiprocessing has some of this actually, but I
> haven't looked at what they are doing underneath the hood.  I
> encourage you to look at the example Fernando described.  It really
> shows in significant ways that we are not doing RPC.
>
Frankly I think the difference between messaging and RPC is mostly a
terminological one. The presence of message queues really just means that
the system provides asynchronous services, and many RPC frameworks
provide that. (As a digression: in the OO design world they sometimes even
say "send a message to the object" instead of "call an object's method".
Weird geeks :))
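
The two patterns named in the quoted text, request/reply and pub/sub, can be
sketched with nothing but in-process queues. This is a toy illustration with
the standard library, not the 0MQ API. The names serve, publish and demo are
invented for the example. It shows why the line between "asynchronous RPC"
and "messaging" is easy to blur: the call is one message on a queue and the
return value is another.

```python
import queue
import threading


def serve(requests):
    """Worker loop: handle (payload, reply_queue) messages until None arrives."""
    while True:
        message = requests.get()
        if message is None:
            break
        payload, reply_to = message
        reply_to.put(payload.upper())  # the "return value" is just a message


def publish(subscribers, message):
    """Pub/sub: put the same message into every subscriber's queue."""
    for inbox in subscribers:
        inbox.put(message)


def demo():
    # Request/reply: send a request together with the queue to answer on.
    requests = queue.Queue()
    worker = threading.Thread(target=serve, args=(requests,))
    worker.start()

    reply_to = queue.Queue()
    requests.put(("ping", reply_to))
    reply = reply_to.get(timeout=5)

    requests.put(None)  # sentinel that tells the worker to stop
    worker.join()

    # Pub/sub: one sender, several receivers, no reply expected.
    inboxes = [queue.Queue(), queue.Queue()]
    publish(inboxes, "status: idle")
    received = [inbox.get_nowait() for inbox in inboxes]
    return reply, received


if __name__ == "__main__":
    print(demo())
```

The sender in the request/reply half could do other work between put() and
get(), which is the asynchronous behaviour the queue buys.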

> > The reason is that IPython already has a lot of useful and exciting
> > functionality and yet another RPC framework is somewhat too much. Plus,
> > you don't have to think about these too low level details like
> > communication protocols, serialization etc.
>
> 0MQ is definitely not another RPC framework.  If you know that RPyC
> addresses some or all of these issues I have brought up above, I would
> seriously love to know.  One of these days, I will probably try to do
> some benchmarks that compare twisted, multiprocessing, RPyC and 0MQ
> for things like latency and throughput.  That would be quite
> interesting.
>
Yes, 0MQ is not an RPC framework. It is just a low-level protocol (albeit
probably a good one) that you will use to build your own RPC/RMI/messaging
system. Frankly I do not see how 0MQ is immune to all the issues you've
brought up above unless you drop Python and code everything in C/C++. In my
experience latencies and performance bottlenecks usually come from the
code that serves messages (i.e. the server part), not the transport layer,
unless you develop some high-load server handling thousands of messages per
second, which is not the case for IPython I believe. Or the network itself
could be just slow, but in that case no library can help, unfortunately. But
of course I could easily be missing something obvious.

Please do not think that I'm trying to bash the pyzmq idea, not at all! I
think it is a great idea for IPython and it will be real fun to implement.
I'm just trying to understand what is so different in IPython that no other
RPC/RMI/messaging framework can fit? RPyC, alongside Pyro, was just the
first one that came to mind when I read Fernando's post, but there are a lot
of them; see for example Python's wiki for a list:
http://wiki.python.org/moin/ParallelProcessing.
I personally have successfully used another toolkit not mentioned on the
above page, http://www.spread.org. It is a group communication toolkit that
provides guaranteed message delivery and so-called virtual synchrony.

I think that when the first excitement ends and you start to develop this
new interface, you will end up implementing all, or most, of the
functionality that other RPC frameworks have, so it would be useful to at
least check them before implementation.

> Another important part of 0MQ is that it runs over protocols other
> than tcp and interconnects like infiniband.  The performance on
> infiniband is quite impressive.
>
Cool! Any idea how to utilize it in Python/IPython?

Regards,
-- 
Mikhail Terekhov