[Python-bugs-list] [ python-Bugs-465673 ] pthreads need signal protection

noreply@sourceforge.net noreply@sourceforge.net
Fri, 28 Sep 2001 09:08:20 -0700


Bugs item #465673, was opened at 2001-09-27 07:36
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=465673&group_id=5470

Category: Threads
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Nobody/Anonymous (nobody)
>Assigned to: Guido van Rossum (gvanrossum)
Summary: pthreads need signal protection

Initial Comment:
I've been playing around with Python and threads, and
I've noticed some odd and often unstable behavior.  In
particular, on my Solaris 8 box I can get Python 1.5.2,
1.6, 2.0, or 2.1 to core dump every time with the
following sequence.  I've also seen this happen on
Solaris 6 (all UltraSPARC based):

1. Enter the following code into the interactive
interpreter:
--
import threading

def loopingfunc():
  while 1: pass

threading.Thread(target=loopingfunc).start()
--

2. Send a SIGINT signal (usually Ctrl-C; your terminal
settings may vary).  "KeyboardInterrupt" is displayed
and so far everything looks fine.

3. Now simply press the <Enter> key to enter a blank
line in the interpreter.  For my Solaris 8 box with the
GNU readline 2.2 module present, this always ends up in
a core dump.  It may take a while, since at this point
the readline signal handler is being re-entered
recursively until the stack overflows.

I've described this problem in the past on Usenet, but
didn't get much response.  For a more complete
discussion of the problem and a possible solution, see

http://groups.google.com/groups?hl=en&threadm=98osml%24sul%241%40newshost.mot.com&rnum=1&prev=/groups%3Fas_ugroup%3Dcomp.lang.python%26as_uauthors%3DJason%2520Lowe

(If the URL doesn't work, search groups.google.com for
posts by "Jason Lowe" in comp.lang.python and view the
entire thread of the result.)

Upon investigation, the problem appears to be caused by
an interaction between pthreads and signals.  The
SIGINT signal is delivered to the thread that is
performing the spin loop, NOT the thread that is in the
readline module.  Because the readline module uses
setjmp()/longjmp() for its signal handling, the
longjmp() ends up being executed by the wrong thread,
with dire results.

Pthreads and signals don't mix very well, so one has to
be very careful to make sure everything works
properly.  A typical solution is to ensure signals are
only delivered to one thread by masking all signals in
all other threads.  I believe this is also the root
cause of bug #219772 (Interactive InterPreter+
Thread -> core dump at exit).

I was able to solve the problem by modifying
Python/thread_pthread.h's PyThread_start_new_thread()
to block all signals with pthread_sigmask() after the
new thread was started.  This causes all threads
created by Python except the initial thread to have all
signals masked.  This forces signals to be delivered to
the main thread.  I don't believe anyone is depending
on the current behavior that signals will be delivered
to an indeterminate thread, so this change seems safe.
However, I haven't run many other Python applications
that deal with threads and signals.

I propose that on platforms that implement Python
threads with pthreads, the code masks all signals in
all threads except the initial, main thread.  This will
resolve the problem of signals being delivered to
threads indeterminately.  I think I can dig up my
initial code deltas if desired, or I can always
recreate them.  It's just a few lines to mask signals
in the creating thread before thread creation, then
restore them afterwards.  (This leaves only the main
thread with signals unblocked.)
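
A rough Python-level sketch of the mask-before-create idea
described above.  It assumes a POSIX platform and a modern
interpreter where signal.pthread_sigmask() and
signal.valid_signals() exist (neither was available when this
report was filed).  The proposed fix itself would be C code in
Python/thread_pthread.h; the helper name
start_thread_with_signals_blocked is hypothetical and only
illustrates the ordering of the steps:

```python
import signal
import threading


def start_thread_with_signals_blocked(target, *args):
    """Start a thread that inherits a fully blocked signal mask.

    Hypothetical helper, for illustration only.
    """
    # Block every signal the platform allows in the creating thread.
    # SIGKILL and SIGSTOP cannot be blocked; the OS ignores the request.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK,
                                      signal.valid_signals())
    try:
        # A new pthread inherits the mask of the thread that creates it.
        thread = threading.Thread(target=target, args=args)
        thread.start()
    finally:
        # Restore the creator's original mask so it still sees signals.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
    return thread
```

A thread started this way keeps all signals blocked, so a
process-directed SIGINT can only be delivered to a thread that
still has it unblocked, which would normally be the main thread.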

A side question from this is whether the thread module
(or posix module?) should expose the pthread_sigmask()
functionality to Python threads on a platform that uses
pthreads.  This would allow developers to manipulate
the signal masks of the Python threads so that a
particular signal can be routed to a particular
thread.  (They would mask this signal in all other
threads except the desired thread.)
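
For illustration, here is one way such routing might look from
Python, assuming a POSIX platform and a modern interpreter that
provides signal.pthread_sigmask() and signal.sigwait().  Note
that this is a variation on the scheme above: the signal is
blocked in every thread and the chosen thread collects it
synchronously with sigwait(), rather than leaving it unblocked
there, because CPython runs Python-level signal handlers only in
the main thread.  The function names and the choice of SIGUSR1
are hypothetical:

```python
import os
import signal
import threading


def run_signal_listener(signum, received, ready):
    """Dedicated thread: wait synchronously for one blocked signal."""
    ready.set()
    received.append(signal.sigwait({signum}))


def route_signal_to_listener(signum=signal.SIGUSR1):
    """Deliver one process-directed signal to a dedicated thread."""
    # Block the signal here first; the listener inherits this mask,
    # so no thread in this sketch takes the signal asynchronously.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signum})
    received = []
    try:
        ready = threading.Event()
        listener = threading.Thread(target=run_signal_listener,
                                    args=(signum, received, ready))
        listener.start()
        ready.wait()
        # The signal stays pending until the listener's sigwait()
        # picks it up.
        os.kill(os.getpid(), signum)
        listener.join()
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
    return received
```

This sketch assumes that no other thread in the process has the
signal unblocked; if one does, the signal could be delivered
there instead.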



----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2001-09-28 09:08

Message:
Logged In: YES 
user_id=6380

I don't have Solaris access, and I can't get this to break
on Linux. But I agree with your suggestion that posix
threads should block signals.

Are you capable of coming up with a patch that does that, in
a way that is independent of the specific platform (as long
as it has PTHREADS)? You may have to open a new issue in the
patch manager, since SF doesn't allow after-the-fact
attachments to anonymous entries. (Maybe SF logs you out
whenever you quit your browser? That's what it does for me.
:-)


----------------------------------------------------------------------

Comment By: Jason Lowe (jasonlowe)
Date: 2001-09-27 07:40

Message:
Logged In: YES 
user_id=56897

Ack.  SourceForge wants to log me out every few minutes, so
I wasn't logged in when I submitted this. Sorry 'bout that.


----------------------------------------------------------------------
