[Python-Dev] segfaults on Mac (was Re: Long-time shy failure in test_socket_ssl)

Tim Peters tim.peters at gmail.com
Tue Mar 21 03:38:11 CET 2006


[Nick Coghlan]
> Hrrm, test_thread_state hands an object pointer off to a different thread
> without INCREF'ing it first. Could there be a race condition with the
> deallocation of arguments at the end of test_thread_state?
>
> Specifically, that trace looks to me like the spawned thread is trying to call
> the function passed in while the main thread is trying to delete the argument
> tuple passed to test_thread_state.

Good eye!  I haven't been able to _provoke_ that failure on Windows
yet, but it sure looks promising.  Compounding the problem is that the
thread tests in test_capi.py are set up incorrectly:  due to "hidden"
fighting over Python's import lock, the second TestThreadState here
(the one run in the new thread) can't really start before test_capi.py
(when run via regrtest.py) finishes:

if have_thread_state:
    TestThreadState()
    import threading
    t=threading.Thread(target=TestThreadState)  # <-- THIS ONE
    t.start()

The dramatic <wink> way to show that is to add "t.join()" after
t.start():  test_capi.py then deadlocks.  That's not a problem with
t.join().  The problem is that the thread running TestThreadState is
hung trying to do its imports, which it can't do before regrtest.py's
import of test_capi.py completes, because of the hidden import lock.
So the thread never makes progress, and t.join() waits forever.  I can
fix that easily (and will), but I want to provoke "a problem" on my
box first.
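
A minimal sketch of one way the test could be restructured, assuming
the thread is started from a function that the test driver calls after
the import of the module has finished, rather than at module level.
The helper name run_in_both_threads is hypothetical, and test_func
stands in for TestThreadState; this is an illustration of the idea,
not necessarily the fix that gets checked in.

```python
import threading

def run_in_both_threads(test_func):
    """Run test_func in the calling thread, then once more in a new thread.

    Hypothetical helper: meant to be called from a function that runs
    after the enclosing module has been imported, not at module level.
    """
    # First run, in the calling thread.
    test_func()

    # Second run, in a freshly created thread.
    t = threading.Thread(target=test_func)
    t.start()

    # Waiting here is reasonable only because no import of the calling
    # module is in progress, so the new thread is not blocked on it.
    t.join()
```

With the join in place, the second run finishes before the caller
returns, so it can no longer overlap with whatever test runs next.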

The consequence is that the instance of TestThreadState run in the
thread ends up running in parallel with later tests run by
regrtest.py.  Therefore it's not surprising to see a segfault due to
test_thread_state occur a few tests after regrtest.py believes
test_capi has completed.

Later:  by adding strategic little counters and conditional sleeps all
over the place in the C code, I was able to reproduce a segfault on
Windows.  I'm not yet sure how to fix it.  Repairing the structure of
test_capi.py would make it more likely that the segfault occurs closer
to the time test_capi runs.  But the segfault actually occurs in a
thread spawned by the _C_ test code, not by threading.py, and there's
nothing we can do in test_capi.py to wait for that thread to finish.

