[IPython-dev] ZMQ Segfault - IPython cluster
MinRK
benjaminrk at gmail.com
Thu Aug 21 12:14:08 EDT 2014
On Thu, Aug 21, 2014 at 4:21 AM, Dave Hirschfeld <dave.hirschfeld at gmail.com>
wrote:
> I'm running an IPython cluster on Windows HPC x64, Python 2.7.
>
> When I start over ~80-90 engines I notice that the controller segfaults
> with the only error message printed in the terminal being:
>
> Assertion failed: fds.size () <= FD_SETSIZE
> (bundled\zeromq\src\select.cpp:68)
>
>
> Looking on the net there's a fair few posts about this. My take on this is
> that it's a bug in libzmq - it should never just kill the host process,
> which is exactly what the assert does (at least on Windows).
>
> The below script should segfault python when N > 1023.
>
> ```
> import zmq, sys
> N = 1024
> print 'ZMQ v',zmq.zmq_version(),' N =',N
> ctx = zmq.Context()
> try:
>     for _ in range(N):
>         sock = ctx.socket(zmq.PAIR)
>         sock.close()
> finally:
>     del ctx
> ```
>
> The problem arises with the number of connections required by the IPython
> cluster which it seems can easily exceed this number.
>
> I hadn't noticed this problem previously but that could well be because I
> wasn't using so many engines. Has anyone else come across this problem
> before?
>
> With 90 engines running the Task Manager reports the controller process as
> using 1353 handles and netstat says it has 1132 open sockets:
>
> In [44]: controller_pid = '3744'
>
> In [45]: socks = !netstat -a -o -n
>
> In [46]: len(socks)
> Out[46]: 2861
>
> In [47]: len(socks.grep(controller_pid, field=-1))
> Out[47]: 1132
>
>
> Is this expected? If so, the default (compile time) value for FD_SETSIZE
> should really be increased IMHO - especially in light of the severity
> (segfault) of the problem.
>
> I've recompiled pyzmq/libzmq manually with FD_SETSIZE=8192 and haven't had
> any further problems.
>
Yes, this is expected. The default for FD_SETSIZE is 1024 in libzmq (up
from the default of 64 on Windows), and can be increased at compile time.
As maintainer of pyzmq, I am reluctant for the defaults of pyzmq's bundled
libzmq to be different from the defaults for a traditionally built libzmq,
but zeromq-dev may be open to pushing up the default.
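
For a rough sense of where the limit bites, the figures quoted in this thread (1132 sockets with 90 engines) can be turned into a back-of-the-envelope estimate. This is only a sketch: the helper names are made up for illustration, and it assumes the controller's socket count grows linearly with the number of engines and that every one of those sockets counts against FD_SETSIZE, neither of which is confirmed here.

```python
# Rough estimate only. Assumptions (not confirmed in this thread):
#   - controller sockets grow linearly with the number of engines
#   - every such socket counts against the FD_SETSIZE limit
# Both helper names are hypothetical, chosen for this illustration.

def sockets_per_engine(open_sockets, engines):
    """Average number of controller sockets per connected engine."""
    return open_sockets / float(engines)


def max_engines(fd_setsize, per_engine):
    """Largest engine count whose sockets stay within fd_setsize."""
    return int(fd_setsize // per_engine)


# Figures reported above: 1132 open sockets with 90 engines running.
per_engine = sockets_per_engine(1132, 90)

print("sockets per engine: %.1f" % per_engine)
print("FD_SETSIZE=1024 -> about %d engines" % max_engines(1024, per_engine))
print("FD_SETSIZE=8192 -> about %d engines" % max_engines(8192, per_engine))
```

Under those assumptions the 1024 default works out to roughly 80 engines, which is in line with the reported failures at ~80-90.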
-MinRK
<pedant> a segfault is bad memory access; assert != segfault </pedant>
> Regards,
> Dave
>
>
> {'commit_hash': '681fd77',
> 'commit_source': 'installation',
> 'default_encoding': 'cp1252',
> 'ipython_path':
> 'C:\\Python\\envs\\quantdev\\1.4.0.post673.g1d57b50\\lib\\site-packages\\IPython',
> 'ipython_version': '2.1.0',
> 'os_name': 'nt',
> 'platform': 'Windows-2008ServerR2-6.1.7601-SP1',
> 'sys_executable':
> 'C:\\Python\\envs\\quantdev\\1.4.0.post673.g1d57b50\\python.exe',
> 'sys_platform': 'win32',
> 'sys_version': '2.7.8 |Continuum Analytics, Inc.| (default, Jul 2 2014,
> 15:12:11) [MSC v.1500 64 bit (AMD64)]'}