[ python-Bugs-1069160 ] PyThreadState_SetAsyncExc segfault

SourceForge.net noreply at sourceforge.net
Fri Aug 11 00:55:36 CEST 2006


Bugs item #1069160, was opened at 2004-11-18 20:48
Message generated for change (Comment added) made by tim_one
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1069160&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Threads
Group: Python 2.4
Status: Closed
Resolution: None
Priority: 6
Submitted By: Tim Peters (tim_one)
Assigned to: Guido van Rossum (gvanrossum)
Summary: PyThreadState_SetAsyncExc segfault

Initial Comment:
PyThreadState_SetAsyncExc() crawls over the list of 
tstates without holding a mutex.  Python implementation 
code has tried to get away with this kind of thing before 
(& more than once <wink>:  and segfaults are always 
eventually reported.

A common cause is that PyThreadState_DeleteCurrent() 
is typically called *without* the GIL held, so any thread 
can try to delete its own tstate *while* 
PyThreadState_SetAsyncExc() is trying to muck with 
that tstate.  That the GIL is held by 
PyThreadState_SetAsyncExc's caller is no protection.  
Instead the code has to serialize against tstate chain 
mutations by acquiring head_mutex for the duration.

Of course this is rare and can be virtually impossible to 
reproduce, but that makes the bug worse, not "better".





----------------------------------------------------------------------

>Comment By: Tim Peters (tim_one)
Date: 2006-08-10 18:55

Message:
Logged In: YES 
user_id=31435

For Python 2.5 (rev 51195), I changed the code as suggested
in my 2005-02-06 comment; changed the docs to say the return
value is always 0 or 1; and added a (ctypes-based) test case
to test_threading.py.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2005-02-07 21:27

Message:
Logged In: YES 
user_id=31435

Backport:  sure, why not.  Just don't ask anyone to test it 
<wink>.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2005-02-07 21:19

Message:
Logged In: YES 
user_id=80475

Should the head lock's be backported to Py2.4?

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2005-02-07 21:08

Message:
Logged In: YES 
user_id=6380

OK, to prevent further waste of time I've checked in the
proposed fix.

I have no idea what the count>1 comment was referring to any
more; I'm sure I wrote it though.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2005-02-07 15:24

Message:
Logged In: YES 
user_id=31435

So we're peeing away time on a Mystery Function with no 
known actual uses in the entire world.  Cool <wink>.

----------------------------------------------------------------------

Comment By: Alex Martelli (aleax)
Date: 2005-02-07 15:11

Message:
Logged In: YES 
user_id=60314

I've checked the current codebase of the client for whom I originally 
wrote code using PyThreadState_SetAsyncExc, and that code is not 
currently used in production.  It did that call while holding the GIL, FWIW.


----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2005-02-07 14:17

Message:
Logged In: YES 
user_id=92689

To be honest: I don't yet have any code that relies on the feature, so 
there's nothing to break. Don't know about Alex, I'll ask him.

I did a quick test (through ctypes), and the function works well with and 
without patch. I've attached the test script; it's mostly due to Peter 
Hansen: http://groups-beta.google.com/group/comp.lang.python/msg/
d310502f7c7133a9

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2005-02-07 10:41

Message:
Logged In: YES 
user_id=31435

Just, this function isn't called anywhere in the Python 
distribution, so Armin and Guido and I have no certain way to 
know that it doesn't break the feature completely.

The presumption here is that you do have code that uses this 
feature.  If so, we're just asking that you verify your code 
still works after applying the patch.  OTOH, if you don't have 
any code that uses this function, then there's no reason for 
you to get sucked into this.

----------------------------------------------------------------------

Comment By: Armin Rigo (arigo)
Date: 2005-02-07 06:27

Message:
Logged In: YES 
user_id=4771

I didn't worry too much about ZAP deadlocks precisely because this 
problem was already there elsewhere.  I guess that if this is a problem, it 
should be discussed and fixed in another bug report, after we apply the 
obvious two-liner patch of the current tracker. 

----------------------------------------------------------------------

Comment By: Just van Rossum (jvr)
Date: 2005-02-07 04:04

Message:
Logged In: YES 
user_id=92689

I'm not able to judge the patch. I'm only the one who had the feature 
request (together with Alex Martelli), but I know next to nothing about 
the implementation and all the subtle details. If you need a volunteer to 
apply a patch, sure, I'll gladly help, but the discussion is way over my 
head here.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2005-02-06 20:51

Message:
Logged In: YES 
user_id=31435

It's already documented that PyThreadState_SetAsyncExc() 
must be called with the GIL held.  head_mutex is only used to 
protect traversal/mutation of the threadstate chain, so is 
only held internally by functions that create, or destroy, 
thread states or interpreter states.  It's certainly possible to 
get a deadlock then after the patch, not due to the GIL, but 
due to head_mutex alone.  I doubt that any __del__ method 
would create/destroy thread/interpreter states on its own, 
the bigger danger is that once a __del__ method is invoked, 
other threads can run too, and they may do anything 
(including calling PyThreadState_SetAsyncExc() again!  that 
seem the most likely way to deadlock).

It would be good to have a clue about "I have no idea why 
this function warns about return values greater than 1".  If in 
fact it should be impossible to have a count greater than 1 
given that concurrent mutations to the tstate chain are 
locked out after the patch, then the patch could be rewritten 
to release head_mutex when it found p->thread_id == id the 
first time, before the ZAP.  After the ZAP, the function would 
just return:

    if (p->thread_id == id) {
        PyObject *temp = p->async_exc;
        Py_XINCREF(exc);
        p->async_exc = exc;
        HEAD_UNLOCK();
        Py_XDECREF(temp);
        return 1;
    }

Then deadlock couldn't occur.

I should note that PyInterpreterState_Clear() is prone to the 
same kind of deadlocks on head_mutex (it calls 
PyThreadState_Clear() with head_mutex held, and the latter 
does a lot of ZAPping).

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2005-02-06 19:44

Message:
Logged In: YES 
user_id=6380

But it was Just's idea, so assigning to him. Q for Just:
would applying the patch cause any problems in your code
that's using this feature?

(Q. for Tim: hoes does the head_mutex in this file relate to
the GIL? The ZAP() call would seem to require that the GIL
is already head when this fn. is called, because it could
invoke arbitrary Python __del__ code. Is there any other
call that could acquire these locks in the opposite order,
causing deadlocks?)

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2005-02-03 11:41

Message:
Logged In: YES 
user_id=31435

Yup, that patch is all I had in mind.  It's a shame that nobody 
cares enough about this function to test it (which isn't a dig 
at you, Armin -- I don't care enough either <0.5 wink>).

BTW, I have no idea why this function warns about return 
values greater than 1.  It's possible that the original author 
realized the tstate chain could mutate while the loop was 
running, and that's where "greater than 1" came from.  If so, 
locking head_mutex should make that impossible ... OK, some 
quality time with CVS shows that Guido checked this function 
in, so assigning the report to him.

----------------------------------------------------------------------

Comment By: Armin Rigo (arigo)
Date: 2005-02-02 06:19

Message:
Logged In: YES 
user_id=4771

I guess that the patch you have in mind is just to protect the loop with HEAD_LOCK/HEAD_UNLOCK.  I cannot test this patch (attached), though, as I don't have any code that uses this function.  I can only tell that it also looks like a reasonable thing to do to me.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2004-11-19 10:32

Message:
Logged In: YES 
user_id=31435

Changed Category to Threads.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1069160&group_id=5470


More information about the Python-bugs-list mailing list