[IPython-dev] Heartbeat Device

Brian Granger ellisonbg at gmail.com
Mon Jul 12 12:15:01 EDT 2010


On Fri, Jul 9, 2010 at 3:35 PM, MinRK <benjaminrk at gmail.com> wrote:
> Brian,
> Have you worked on the Heartbeat Device? Does that need to go in 0MQ itself,

I have not.  Ideally it could go into 0MQ itself.  But, in principle,
we could do it in pyzmq.  We just have to write a nogil pure C
function that uses the low-level C API to do the heartbeat.  Then we
can just run that function in a thread with a "with nogil" block.
Shouldn't be too bad, given how simple the heartbeat logic is.  The
main thing we will have to think about is how to start/stop the
heartbeat in a clean way.
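
To make the start/stop question concrete, here is a rough sketch of the echo
loop written at the Python level with pyzmq. It is not the nogil C version
described above: a Python thread still needs the GIL between calls, so it
would stall if engine code held the GIL for a long time. The class name
HeartbeatEcho, the poll_ms parameter, and the choice of a REP socket are
all assumptions made for illustration.

```python
import threading

import zmq


class HeartbeatEcho(threading.Thread):
    """Echo every message received on a REP socket until stop() is called.

    Hypothetical helper: illustrates a clean start/stop only, not the
    GIL-free C loop a real heartbeat would need.
    """

    def __init__(self, context, url, poll_ms=100):
        super().__init__(daemon=True)
        self.context = context
        self.url = url
        self.poll_ms = poll_ms
        # Not named _stop: threading.Thread uses that name internally.
        self._stop_requested = threading.Event()

    def run(self):
        # The socket is created and closed inside the thread that uses it.
        sock = self.context.socket(zmq.REP)
        sock.setsockopt(zmq.LINGER, 0)
        sock.connect(self.url)  # the engine side connects, the controller binds
        poller = zmq.Poller()
        poller.register(sock, zmq.POLLIN)
        try:
            while not self._stop_requested.is_set():
                # Poll with a timeout so the stop flag is checked regularly.
                if poller.poll(self.poll_ms):
                    sock.send(sock.recv())
        finally:
            sock.close()

    def stop(self):
        """Ask the loop to exit and wait for the thread to finish."""
        self._stop_requested.set()
        self.join()
```

Polling with a short timeout instead of blocking in recv is what lets stop()
return promptly; the same idea would carry over to a C loop around zmq_poll.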

> or can it be part of pyzmq?
> I'm trying to work out how to really tell that an engine is down.
> Is the heartbeat to be in a separate process?

No, just a separate C/C++ thread that doesn't hold the GIL.

> Are we guaranteed that a zmq thread is responsive no matter what an engine
> process is doing? If that's the case, is a moderate timeout on recv adequate
> to determine engine failure?

Yes, I think we can assume this.  The only thing that would take the
0mq thread down is something semi-fatal like a signal that doesn't get
handled.  But as long as the 0MQ thread doesn't have any bugs, it
should simply keep running no matter what the other thread does (OK,
other than segfaulting).

> If zmq threads are guaranteed to be responsive, it seems like a simple pair
> socket might be good enough, rather than needing a new device. Or even
> through the registration XREP socket.

That (registration XREP socket) won't work unless we want to write all
that logic in C.
I'm not sure a PAIR socket would work, given the need for multiple clients.

> Can we formalize exactly what the heartbeat needs to be?

OK, let's think.  The engine needs to connect, the controller bind.
It would be nice if the controller didn't need a separate heartbeat
socket for each engine, but I guess we need the ability to track which
specific engine is heartbeating.  Also, there is the question of whether
we want to do a request/reply or pub/sub style heartbeat.  What do you
think?
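
Whichever socket pattern we pick, the controller-side bookkeeping could be
as small as the sketch below: record the last time each engine was heard
from and report the ones that have been silent longer than a timeout. The
names HeartbeatMonitor, beat and dead_engines, and the 3 second default,
are hypothetical and only for illustration.

```python
import time


class HeartbeatMonitor:
    """Track the last heartbeat time per engine (hypothetical sketch)."""

    def __init__(self, timeout=3.0, clock=time.monotonic):
        self.timeout = timeout  # seconds of silence before an engine is suspect
        self.clock = clock      # injectable so the logic can be tested
        self.last_seen = {}     # engine id -> time of most recent heartbeat

    def beat(self, engine_id):
        """Record that a heartbeat from engine_id has just arrived."""
        self.last_seen[engine_id] = self.clock()

    def dead_engines(self):
        """Return the sorted ids of engines silent for longer than timeout."""
        now = self.clock()
        return sorted(
            engine_id
            for engine_id, seen in self.last_seen.items()
            if now - seen > self.timeout
        )
```

This keeps the "which engine" question out of the socket layer: any
transport that can deliver an engine id with each beat can feed beat().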

Brian


> -MinRK



-- 
Brian E. Granger, Ph.D.
Assistant Professor of Physics
Cal Poly State University, San Luis Obispo
bgranger at calpoly.edu
ellisonbg at gmail.com