[issue6362] multiprocessing: handling of errno after signals in sem_acquire()

Ryan Leslie report at bugs.python.org
Mon Jun 29 02:55:54 CEST 2009


New submission from Ryan Leslie <rylesny at gmail.com>:

While developing an application, an inconsistency was noted where,
depending on the particular signal handler in use,
multiprocessing.Queue.put() may (or may not) raise OSError() after
sys.exit() was called by the handler. The following example, which was
tested with Python 2.6.1 on Linux, demonstrates this.

#!/usr/bin/env python

import multiprocessing
import signal
import sys

def handleKill(signum, frame):
   #sys.stdout.write("Exit requested by signal.\n")
   print "Exit requested by signal."
   sys.exit(1)
signal.signal(signal.SIGTERM, handleKill)

queue = multiprocessing.Queue(maxsize=1)
queue.put(None)
queue.put(None)

When the script is run, the process will block (as expected) on the
second queue.put(). If (from another terminal) I send the process
SIGTERM, I consistently see:

$ ./q.py
Exit requested by signal.
$

Now, if I modify the above program by commenting out the 'print', and
uncommenting the 'sys.stdout' (a very subtle change), I would expect
the result to be the same when killing the process. Instead, I
consistently see:

$ ./q.py
Exit requested by signal.
Traceback (most recent call last):
 File "./q.py", line 15, in <module>
   queue.put(None)
 File "python2.6/multiprocessing/queues.py", line 75, in put
   if not self._sem.acquire(block, timeout):
OSError: [Errno 0] Error
$ 

After debugging this further, the issue appears to be in
semlock_acquire() in Modules/_multiprocessing/semaphore.c:
http://svn.python.org/view/python/trunk/Modules/_multiprocessing/semaphore.c?revision=71009&view=markup

The relevant code from (the Unix version of) semlock_acquire() is:

    do {
        Py_BEGIN_ALLOW_THREADS
        if (blocking && timeout_obj == Py_None)
            res = sem_wait(self->handle);
        else if (!blocking)
            res = sem_trywait(self->handle);
        else
            res = sem_timedwait(self->handle, &deadline);
        Py_END_ALLOW_THREADS
        if (res == MP_EXCEPTION_HAS_BEEN_SET)
            break;
    } while (res < 0 && errno == EINTR && !PyErr_CheckSignals());

    if (res < 0) {
        if (errno == EAGAIN || errno == ETIMEDOUT)
            Py_RETURN_FALSE;
        else if (errno == EINTR)
            return NULL;
        else
            return PyErr_SetFromErrno(PyExc_OSError);
    }

In both versions of the program (print vs. sys.stdout), sem_wait() is
being interrupted and is returning -1 with errno set to EINTR. This is
what I expected. Also, in both cases it seems that the loop is
(correctly) terminating with PyErr_CheckSignals() returning non-zero.
This makes sense too; the call is executing our signal handler, and then
returning -1 since our particular handler raises SystemExit.

However, I suspect that, depending on the exact code executed by the
signal handler, errno may or may not be reset by some call nested
inside PyErr_CheckSignals(). The error checking code below the
do-while loop needs errno to still hold the value set by sem_wait(),
and I believe the author wasn't expecting anything else to have
changed it. In the "print" version, errno apparently remains EINTR,
so the `return NULL' statement is reached. In the "sys.stdout" version
(and probably many others), errno ends up reset to 0, so the error
handling falls through to the
`return PyErr_SetFromErrno(PyExc_OSError)' statement, which produces
the "[Errno 0] Error" shown in the traceback above.

To patch this up, we can probably just save errno as, say, `wait_errno'
at the end of the loop body, and then use it within the error handling
block that follows. However, the rest of the code should probably be
checked for this type of issue.
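
As a rough illustration of that idea (not the actual patch), here is a
minimal standalone sketch that imitates the loop without needing Python
or a real semaphore. Everything except the name `wait_errno' is a
hypothetical stand-in invented for this example: fake_sem_wait() plays
the role of an interrupted sem_wait(), fake_check_signals() plays the
role of PyErr_CheckSignals() running a handler that raises and leaves
errno at 0, and classify() mirrors the if/else chain after the loop.

```c
#include <assert.h>
#include <errno.h>

/* All names below are hypothetical stand-ins, not CPython APIs. */
enum outcome { ACQUIRED, WOULD_BLOCK, INTERRUPTED, OS_ERROR };

/* Stand-in for sem_wait() being interrupted by a signal. */
static int fake_sem_wait(void)
{
    errno = EINTR;
    return -1;
}

/* Stand-in for PyErr_CheckSignals(): the handler "raises" (non-zero
   return), and some nested library call has left errno at 0. */
static int fake_check_signals(void)
{
    errno = 0;
    return -1;
}

/* Mirrors the error handling that follows the do-while loop. */
static enum outcome classify(int res, int err)
{
    if (res >= 0)
        return ACQUIRED;
    if (err == EAGAIN || err == ETIMEDOUT)
        return WOULD_BLOCK;
    if (err == EINTR)
        return INTERRUPTED;   /* corresponds to `return NULL' */
    return OS_ERROR;          /* corresponds to PyErr_SetFromErrno() */
}

/* Current pattern: errno is read again after the signal check. */
static enum outcome acquire_unsaved(void)
{
    int res;
    do {
        res = fake_sem_wait();
    } while (res < 0 && errno == EINTR && !fake_check_signals());
    return classify(res, errno);
}

/* Proposed pattern: remember errno right after the wait call. */
static enum outcome acquire_saved(void)
{
    int res, wait_errno;
    do {
        res = fake_sem_wait();
        wait_errno = errno;   /* saved before anything can clobber it */
    } while (res < 0 && wait_errno == EINTR && !fake_check_signals());
    return classify(res, wait_errno);
}

/* Usage: call from main() to check both behaviours. */
static void run_demo(void)
{
    assert(acquire_unsaved() == OS_ERROR);    /* the "[Errno 0]" case */
    assert(acquire_saved() == INTERRUPTED);   /* handler's exception wins */
}
```

In the real function the save would presumably sit next to the
sem_wait()/sem_trywait()/sem_timedwait() calls, and whether it belongs
before or after Py_END_ALLOW_THREADS is something a real patch would
need to decide; the sketch does not model that.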

----------
components: Library (Lib)
messages: 89804
nosy: ryles
severity: normal
status: open
title: multiprocessing: handling of errno after signals in sem_acquire()
type: behavior
versions: Python 2.6, Python 2.7, Python 3.0, Python 3.1, Python 3.2

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue6362>
_______________________________________

