[pypy-dev] Dead loop occurs when using python-daemon and multiprocessing together in PyPy 4.0.1

hubo hubo at jiedaibao.com
Wed Dec 23 08:36:09 EST 2015


No, the python-daemon module is critical to this problem, because it is python-daemon that closes the fd for /dev/urandom. When a process switches to a daemon, it forks itself and then closes all open fds (including stdin, stdout and stderr), so it also closes the fd for /dev/urandom that the PyPy library uses. This is the standard behavior defined by https://www.python.org/dev/peps/pep-3143/#daemoncontext-objects and also the standard behavior for Unix daemons. Unfortunately, there is no way to prevent the fd from being closed without knowing exactly which number it has.
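
As a possible way to learn that number on Linux, the sketch below scans /proc/self/fd for descriptors that point at a given path. The helper name find_fds_for_path is hypothetical, and relying on /proc is an assumption that does not hold on other Unix systems.

```python
import os


def find_fds_for_path(path, proc_dir="/proc/self/fd"):
    """Return the open fd numbers of this process that refer to `path`.

    Hypothetical helper: Linux-only, because it relies on /proc/self/fd.
    """
    target = os.path.realpath(path)
    matches = []
    for name in os.listdir(proc_dir):
        try:
            link = os.readlink(os.path.join(proc_dir, name))
        except OSError:
            # The fd used to list the directory is already gone by now.
            continue
        if link == target:
            matches.append(int(name))
    return sorted(matches)
```

PEP 3143 defines a files_preserve option on DaemonContext, so the result could in principle be passed as files_preserve=find_fds_for_path("/dev/urandom"). Whether that is enough to avoid the problem in PyPy 4.0.1 has not been checked here.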

Without python-daemon (or a similar library), the problem can only be reproduced by forcibly closing the fd (usually 4), but that does not make much sense.
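
For reference, a minimal sketch of that forced-close reproduction without python-daemon is shown below. The function name and the max_fd limit of 256 are assumptions for illustration. It runs in a forked child so that the caller's own descriptors stay untouched. On an interpreter that caches a descriptor for /dev/urandom, the second os.urandom call would be expected to fail with EBADF; on one that does not, it should succeed.

```python
import os


def urandom_survives_fd_close(max_fd=256):
    """Fork a child, close every fd >= 3 in it, and retry os.urandom.

    Returns True if os.urandom still works in the child after the close.
    """
    pid = os.fork()
    if pid == 0:
        status = 2  # unexpected error
        try:
            os.urandom(16)            # may open and cache /dev/urandom
            os.closerange(3, max_fd)  # roughly what a daemon library does
            os.urandom(16)            # fails if a cached fd was closed
            status = 0
        except OSError:
            status = 1
        finally:
            os._exit(status)
    _, wait_status = os.waitpid(pid, 0)
    return os.WIFEXITED(wait_status) and os.WEXITSTATUS(wait_status) == 0
```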


2015-12-23 

hubo 



From: Maciej Fijalkowski <fijall at gmail.com>
Sent: 2015-12-23 21:22
Subject: Re: Re: Re: [pypy-dev] Dead loop occurs when using python-daemon and multiprocessing together in PyPy 4.0.1
To: "hubo"<hubo at jiedaibao.com>
Cc: "pypy-dev"<pypy-dev at python.org>

Can you also reproduce the OSError problem without the daemon module involved?


On Wed, Dec 23, 2015 at 3:14 PM, hubo <hubo at jiedaibao.com> wrote:

I can only reproduce the OSError problem. Maybe the 100% CPU usage is not really a deadlock, but rather some kind of automatic crash report? Although it is quite easy to crash the program with os.urandom, it only stops responding when the crash happens in system libraries like multiprocessing or email.

The posix.urandom problem is quite easy to reproduce:

#!/usr/bin/pypy
import os
os.urandom(16)
def test():
    print repr(os.urandom(16))
import daemon
import sys
if __name__ == '__main__':
    with daemon.DaemonContext(initgroups=False, stderr=sys.stderr,stdout=sys.stdout):
        test()

(stderr and stdout are kept open to show console messages in the daemon. initgroups=False is a workaround for python-daemon not working in Python 2.6.)

Or, with the random module:

#!/usr/bin/pypy
import random
def test():
    random.Random()
import daemon
import sys
if __name__ == '__main__':
    with daemon.DaemonContext(initgroups=False, stderr=sys.stderr,stdout=sys.stdout):
        test()

When the scripts are run with pypy:

pypy test3.py

they crash with OSError:
Traceback (most recent call last):
  File "test2.py", line 13, in <module>
    test()
  File "test2.py", line 6, in test
    random.Random()
  File "/opt/pypy-4.0.1-linux_x86_64-portable/lib-python/2.7/random.py", line 95, in __init__
    self.seed(x)
  File "/opt/pypy-4.0.1-linux_x86_64-portable/lib-python/2.7/random.py", line 111, in seed
    a = long(_hexlify(_urandom(2500)), 16)
OSError: [Errno 9] Bad file descriptor

It is still not clear why this causes a dead loop (or a long period of unresponsiveness) in multiprocessing (it should have thrown an ImportError), or under what exact conditions the file descriptor for /dev/urandom appears (just calling os.urandom and importing random does not reproduce the result), but I believe it is definitely linked to the problem.

2015-12-23 

hubo 



From: Maciej Fijalkowski <fijall at gmail.com>
Sent: 2015-12-23 20:07
Subject: Re: Re: [pypy-dev] Dead loop occurs when using python-daemon and multiprocessing together in PyPy 4.0.1
To: "hubo"<hubo at jiedaibao.com>
Cc: "pypy-dev"<pypy-dev at python.org>

That's very interesting, can you produce a standalone example that does not use multiprocessing? That would make it much easier to fix the bug (e.g. os.fork followed by os.urandom failing)


On Wed, Dec 23, 2015 at 1:54 PM, hubo <hubo at jiedaibao.com> wrote:

Thanks for the response. Should I put it directly in the bug tracker?

FYI, I've traced the cause to an incompatibility between python-daemon (or rather the standard Unix daemon behavior) and PyPy's posix.urandom implementation.

It seems that in PyPy 4.0.1, when the random module is loaded, a file descriptor is opened on /dev/urandom. I think the PyPy implementation uses this shared descriptor to read from /dev/urandom. Sadly, when python-daemon forks the process and turns it into a Unix daemon, it closes all currently open file descriptors. After that, all os.urandom calls fail with OSError. I think other functions of the Random class may also use the file descriptor in C code without ever checking whether the return value is 0, and that causes the dead loop.

I think the problem would be solved if the implementation re-opened the handle whenever it has been closed.
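
To illustrate the idea, here is a rough pure-Python sketch of a cached /dev/urandom reader that re-opens its descriptor once when a read fails with EBADF. The class name CachedUrandom is hypothetical, and this is not how PyPy actually implements posix.urandom.

```python
import errno
import os


class CachedUrandom(object):
    """Hypothetical reader that keeps one fd open on /dev/urandom."""

    def __init__(self, path="/dev/urandom"):
        self._path = path
        self._fd = None

    def read(self, n):
        chunks = []
        remaining = n
        retried = False
        while remaining > 0:
            if self._fd is None:
                self._fd = os.open(self._path, os.O_RDONLY)
            try:
                data = os.read(self._fd, remaining)
            except OSError as exc:
                if exc.errno == errno.EBADF and not retried:
                    # The cached fd was closed behind our back: re-open once.
                    self._fd = None
                    retried = True
                    continue
                raise
            if not data:
                # A zero-length read must not be retried forever.
                raise OSError(errno.EIO, "unexpected EOF on " + self._path)
            chunks.append(data)
            remaining -= len(data)
        return b"".join(chunks)
```

One limitation of this sketch: if the closed fd number has since been reused for another file, the read does not fail and returns data from the wrong file, so a real fix would also have to verify that the descriptor still refers to /dev/urandom.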

multiprocessing uses random internally. Many other modules also use random, such as email. The dead loop occurs when you use any of these libraries in a daemon.



2015-12-23 

hubo 



From: Maciej Fijalkowski <fijall at gmail.com>
Sent: 2015-12-23 19:35
Subject: Re: [pypy-dev] Dead loop occurs when using python-daemon and multiprocessing together in PyPy 4.0.1
To: "hubo"<hubo at jiedaibao.com>
Cc: "pypy-dev"<pypy-dev at python.org>

Hi hubo 


Can you put it as a bug report? Those things get easily lost on the mailing list (and sadly I won't look at it right now, multiprocessing scares me)


On Wed, Dec 23, 2015 at 12:03 PM, hubo <hubo at jiedaibao.com> wrote:

Hello devs,

A (possible) dead loop occurs when I use python-daemon and multiprocessing together in PyPy 4.0.1. It does not appear in Python (2.6 or 2.7), nor in earlier PyPy versions (2.0.2).

Reproduce:

First install python-daemon:
pypy_pip install python-daemon

Use the following test script (also available in attachment):

#!/usr/bin/pypy
import daemon
import multiprocessing
def test():
    q = multiprocessing.Queue(64)
if __name__ == '__main__':
    with daemon.DaemonContext():
        test()

When executing the script with pypy:
pypy test.py

The background service does not exit, and is consuming 100% CPU:
ps aux | grep pypy
root      7769 99.1  0.5 235332 46812 ?        R    17:52   2:09 pypy test.py
root      7775  0.0  0.0 103252   804 pts/1    S+   17:54   0:00 grep pypy




Executing the script with python:
python2.7 test.py
And the background service exits normally.

Environment:
I'm using CentOS 6.5 with the portable PyPy distribution for Linux (https://bitbucket.org/squeaky/portable-pypy/downloads/pypy-4.0.1-linux_x86_64-portable.tar.bz2).
I ran the script on the system's built-in python (Python 2.6.6), a compiled CPython (2.7.11), and pypy from epel-release (PyPy 2.0.2, Python 2.7.2), and the problem does not appear. Although the compiled CPython is 2.7.11 and PyPy 4.0.1 is Python 2.7.10, I think that does not matter much.

Please contact me if you have any questions or ideas.


2015-12-23


hubo 

_______________________________________________
pypy-dev mailing list
pypy-dev at python.org
https://mail.python.org/mailman/listinfo/pypy-dev