[IPython-dev] pyzmq problems in sending shell messages to a kernel

MinRK benjaminrk at gmail.com
Wed Feb 12 14:01:08 EST 2014


Do you perchance customize KernelManagers?  Specifically, do you change how
sockets are created or what socket types are used?
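
For what it's worth, here is a minimal sketch of the kind of check I have in
mind, not anything taken from your setup. It assumes a DEALER socket on the
client side talking to a ROUTER socket that stands in for the kernel's shell
port; the function name send_and_confirm and the payload are made up for
illustration. It reads back the socket type and sends with track=True, so the
returned MessageTracker shows when zmq is done with the message buffer.

```python
import zmq


def send_and_confirm(payload, timeout_ms=2000):
    """Send one frame DEALER -> ROUTER on localhost and report what each side saw.

    Hypothetical helper for illustration only; the socket types are an assumption.
    """
    ctx = zmq.Context()
    router = ctx.socket(zmq.ROUTER)  # stand-in for a kernel's shell socket
    dealer = ctx.socket(zmq.DEALER)  # stand-in for the requesting client socket
    try:
        port = router.bind_to_random_port("tcp://127.0.0.1")
        dealer.connect("tcp://127.0.0.1:%d" % port)

        # Confirm the socket really has the type we think it has.
        type_ok = dealer.getsockopt(zmq.TYPE) == zmq.DEALER

        # track=True returns a MessageTracker for this send.
        tracker = dealer.send(payload, copy=False, track=True)

        # Check independently whether the peer actually received the frame.
        received = None
        if router.poll(timeout_ms, zmq.POLLIN):
            identity, received = router.recv_multipart()

        # Raises zmq.NotDone if zmq still holds the buffer after the timeout.
        tracker.wait(timeout_ms / 1000.0)
        return type_ok, tracker.done, received
    finally:
        dealer.close(linger=0)
        router.close(linger=0)
        ctx.term()
```

Note that tracker.done only means zmq has released the buffer; it is not an
acknowledgement that the peer received the message, so it is worth comparing
it against what actually arrives on the other end.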

-MinRK


On Wed, Feb 12, 2014 at 10:54 AM, MinRK <benjaminrk at gmail.com> wrote:

> Nothing springs to mind, but I will think about this one for a while. Are
> there likely other outstanding requests to the same kernel at the time,
> and/or from the same requesting socket? If so, can you ballpark how many?
> What are you using to indicate that pyzmq thinks the message has been sent?
>
> -MinRK
>
>
> On Wed, Feb 12, 2014 at 10:45 AM, Jason Grout <jason-sage at creativetrax.com> wrote:
>
>> Hi everyone,
>>
>> I'm trying to track down a problem we're seeing in the Sage cell server
>> with sending computation messages to an IPython kernel.  This may end up
>> being a problem with using pyzmq or zmq, so apologies in advance if it
>> turns out to be OT for this list.
>>
>> The tl;dr version is: it appears that in some very sporadic cases, pyzmq
>> is sending a message (an execute_request message) to a kernel's shell
>> channel tcp port on localhost, but wireshark never registers that
>> message being sent, and the kernel that is supposed to receive the
>> message never acts on it.  My question is: does anyone have suggestions
>> on debugging this or narrowing down the problem?
>>
>> The (abbreviated, simplified) long version: in the sage cell server, we
>> start up a number of IPython kernels that we keep waiting around for
>> computations.  When a computation is requested, we hook up the kernel's
>> shell/iopub/heartbeat channels (i.e., create pyzmq ZMQStream objects
>> connecting to the tcp ports corresponding to the kernel's
>> shell/iopub/heartbeat channels), send an execute_request, and assemble an
>> answer for the user from output coming back on the iopub channel.  When
>> the system is under moderate load, every now and then (maybe once every
>> 300 computations), we send an execute_request message to one of these
>> kernels that is waiting around, and I see the zmq socket code claiming
>> that it sent the message, but wireshark indicates that the message was
>> never transmitted when looking at raw tcp traffic, and the kernel acts
>> like it never received the message.  We didn't change the high water
>> mark for zmq, and I'm running zmq 3.2.2 and pyzmq 14.0.1.  I've spent a
>> long time narrowing the issue down to a zmq message not being sent, even
>> though pyzmq seems to have thought it sent it.  Does anyone have any
>> suggestions for narrowing this down more, or possible causes?
>>
>> I realize that my setup is a bit complicated, and I've tried to simplify
>> the issues (but hopefully not too much).  Any suggestions or help would
>> be appreciated.  The next thing I'm going to do is (a) upgrade zmq to
>> 4.x, and (b) insert some debugging statements in the zmq library itself
>> to see if the C zmq library thinks it sent the message.
>>
>> Thanks,
>>
>> Jason