
[Tim]
When the Zope3 tests are run under Python 2.3, after the test runner ends we usually get treated to a long string of these things:
"""
Unhandled exception in thread started by
Error in sys.excepthook:
Original exception was:
"""
[bunch of analysis deleted]
... Event.wait() with a timeout ends up in _Condition.wait(), where a lazy busy loop wakes up about 20 times per second to see whether it can proceed.
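To make the "lazy busy loop" concrete, here is a minimal sketch of that kind of polling wait. It is not the threading.py source: polling_wait is a hypothetical name, time.monotonic() is a modern stand-in, and the 0.0005 starting delay is an assumption. The 0.05 second cap is what gives about 20 wakeups per second.

```python
import threading
import time

def polling_wait(waiter, timeout):
    """Try to acquire the lock `waiter` within `timeout` seconds by polling.

    Hypothetical helper, not the real _Condition.wait().
    Returns True if the lock was acquired, False on timeout.
    """
    endtime = time.monotonic() + timeout
    delay = 0.0005  # assumed starting nap; doubles on each pass
    while True:
        if waiter.acquire(False):  # non-blocking attempt
            return True
        remaining = endtime - time.monotonic()
        if remaining <= 0:
            return False
        # Cap the nap at 0.05s, i.e. roughly 20 wakeups per second.
        delay = min(delay * 2, remaining, 0.05)
        time.sleep(delay)
```

The point of the sketch is that every wakeup executes ordinary Python code (the time lookups, the acquire call), so a thread parked in a loop like this will try to run Python whenever it next gets the GIL.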
For some reason an exception is getting raised in the wait() code. I'm not exactly sure what or why yet, but that will come soon enough.
It didn't. The primary effect of adding some vanilla debugging prints to threading.py's _Condition.wait() was to make Python die with segfaults at shutdown time instead.

If Python's *second* call to PyGC_Collect() in Py_Finalize() is commented out (the call that occurs after

    /* Destroy all modules */
    PyImport_Cleanup();

), all the problems go away, including the nonsense errors sprayed out at the end of Zope3 test runs.

Recall that the nonsense errors are caused by a dozen stale daemon threads trying to execute Python code after the interpreter has been severely torn down, and they get the *chance* to do this because the second PyGC_Collect() finds trash with Python __del__ methods (so PyGC_Collect() loses the GIL when calling the __del__ methods, and all the daemon threads can proceed then).

Note that this isn't a problem with code *in* __del__ methods! That makes it a different kind of shutdown glitch than we've usually wrestled with. It's a problem with Python code that has nothing to do with __del__; __del__'s only contribution is to release the GIL.

Because the second PyGC_Collect() *is* finding additional finalizers to run, it's unattractive to stop calling it (getting more user-defined finalizers to run was the purpose of adding these PyGC_Collect() shutdown calls to 2.3). OTOH, any __del__ method that runs after PyImport_Cleanup() will be (AFAICT) just as vulnerable to producing nonsense errors and segfaults as the code in _Condition.wait() has proven to be (sys is useless by that point, and all the Python internal code sucking basic objects out of sys isn't expecting to get None back).

Maybe we should remove the second PyGC_Collect() call before more apps run into these mysteries.

Maybe we should delay tearing down sys as a special case (even more of a special case than it is now).

Maybe the Zope3 tests should stop leaving an ever-growing number of daemon threads around (which appears to be the only solution so long as they're run under Python 2.3).
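For the last option, here is a rough sketch of what "not leaving daemon threads around" could look like in a test suite. Everything in it is an assumption for illustration: StoppableWorker and its stop() method are hypothetical names, not anything from Zope3, and it is written against a modern Python where Event.wait() returns the flag.

```python
import threading

class StoppableWorker(threading.Thread):
    """Hypothetical background thread that can be asked to exit."""

    def __init__(self, poll_interval=0.05):
        super().__init__(daemon=True)
        self._stop_requested = threading.Event()
        self._poll_interval = poll_interval
        self.wakeups = 0

    def run(self):
        # Event.wait() returns True once the event is set, ending the loop.
        while not self._stop_requested.wait(self._poll_interval):
            self.wakeups += 1  # stand-in for the thread's real work

    def stop(self, timeout=1.0):
        """Ask the thread to exit, then wait for it; True if it is gone."""
        self._stop_requested.set()
        self.join(timeout)
        return not self.is_alive()
```

A test's teardown would call stop() on each worker it started, so that no thread is still parked in a wait() when Py_Finalize() begins.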