[IPython-dev] ZMQ Segfault - IPython cluster

MinRK benjaminrk at gmail.com
Thu Aug 21 12:14:08 EDT 2014


On Thu, Aug 21, 2014 at 4:21 AM, Dave Hirschfeld <dave.hirschfeld at gmail.com>
wrote:

> I'm running an IPython cluster on Windows HPC x64, Python 2.7.
>
> When I start over ~80-90 engines I notice that the controller segfaults
> with the only error message printed in the terminal being:
>
> Assertion failed: fds.size () <= FD_SETSIZE
> (bundled\zeromq\src\select.cpp:68)
>
>
> Looking on the net there's a fair few posts about this. My take on this is
> that it's a bug in libzmq - it should never just kill the host process,
> which is exactly what the assert does (at least on Windows).
>
> The below script should segfault python when N > 1023.
>
> ```
> import zmq, sys
> N = 1024
> print 'ZMQ v',zmq.zmq_version(),' N =',N
> ctx = zmq.Context()
> try:
>     for _ in range(N):
>         sock = ctx.socket(zmq.PAIR)
>         sock.close()
> finally:
>     del ctx
> ```
>
> The problem arises with the number of connections required by the IPython
> cluster which it seems can easily exceed this number.
>
> I hadn't noticed this problem previously but that could well be because I
> wasn't using so many engines. Has anyone else come across this problem
> before?
>
> With 90 engines running the Task Manager reports the controller process as
> using 1353 handles and netstat says it has 1132 open sockets:
>
> In [44]: controller_pid = '3744'
>
> In [45]: socks = !netstat -a -o -n
>
> In [46]: len(socks)
> Out[46]: 2861
>
> In [47]: len(socks.grep(controller_pid, field=-1))
> Out[47]: 1132
>
>
> Is this expected? If so, the default (compile time) value for FD_SETSIZE
> should really be increased IMHO - especially in light of the severity
> (segfault) of the problem.
>
> I've recompiled pyzmq/libzmq manually with FD_SETSIZE=8192 and haven't had
> any further problems.
>


Yes, this is expected. The default for FD_SETSIZE is 1024 in libzmq (up
from the default of 64 on Windows), and can be increased at compile time.
As maintainer of pyzmq, I am reluctant for the defaults of pyzmq's bundled
libzmq to be different from the defaults for a traditionally built libzmq,
but zeromq-dev may be open to pushing up the default.
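
For a rough sense of scale, the figures quoted above (1132 sockets at 90
engines) can be turned into a back-of-envelope estimate. This is only a
sketch: it assumes the socket count grows linearly with the number of
engines with no fixed overhead, and that every socket netstat reports
counts against a single FD_SETSIZE limit, neither of which is guaranteed.
The function name max_engines is made up for illustration.

```python
from __future__ import print_function

# Figures taken from the report quoted above.
OBSERVED_ENGINES = 90
OBSERVED_SOCKETS = 1132


def max_engines(fd_limit, sockets=OBSERVED_SOCKETS, engines=OBSERVED_ENGINES):
    """Estimate how many engines fit under fd_limit.

    Assumption (not verified): sockets scale linearly with engines,
    so the per-engine cost is sockets / engines. Integer arithmetic
    is used to avoid floating-point rounding.
    """
    return (fd_limit * engines) // sockets


for limit in (1024, 8192):
    print('FD_SETSIZE =', limit, '-> roughly', max_engines(limit), 'engines')
```

Under those assumptions the default limit is reached at roughly 81
engines, which is in line with the ~80-90 reported, and a build with
FD_SETSIZE=8192 would leave room for several hundred.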

-MinRK


<pedant> a segfault is bad memory access; assert != segfault </pedant>


> Regards,
> Dave
>
>
> {'commit_hash': '681fd77',
>  'commit_source': 'installation',
>  'default_encoding': 'cp1252',
>  'ipython_path': 'C:\\Python\\envs\\quantdev\\1.4.0.post673.g1d57b50\\lib\\site-packages\\IPython',
>  'ipython_version': '2.1.0',
>  'os_name': 'nt',
>  'platform': 'Windows-2008ServerR2-6.1.7601-SP1',
>  'sys_executable': 'C:\\Python\\envs\\quantdev\\1.4.0.post673.g1d57b50\\python.exe',
>  'sys_platform': 'win32',
>  'sys_version': '2.7.8 |Continuum Analytics, Inc.| (default, Jul  2 2014, 15:12:11) [MSC v.1500 64 bit (AMD64)]'}
>