Long-time shy failure in test_socket_ssl
Has anyone else noticed this? For a long time (possibly years), I've seen an infrequent error in test_socket_ssl, like so (this is on WinXP Pro):

test_socket_ssl test test_socket_ssl crashed -- exceptions.TypeError: 'NoneType' object is not callable

I haven't been able to provoke it by running test_socket_ssl in a loop, and eyeballing the code gives me no guess about what's going wrong.

test_rude_shutdown() is dicey, relying on a sleep() instead of proper synchronization to make it probable that the `listener` thread goes away before the main thread tries to connect, but while that race may account for bogus TestFailed deaths, it doesn't seem possible that it could account for the kind of failure above.
[Tim Peters]
... test_rude_shutdown() is dicey, relying on a sleep() instead of proper synchronization to make it probable that the `listener` thread goes away before the main thread tries to connect, but while that race may account for bogus TestFailed deaths, it doesn't seem possible that it could account for the kind of failure above.
Well, since it's silly to try to guess about one weird failure when a clear cause for another kind of weird failure is known, I checked in changes to do "proper" thread synchronization and termination in that test. Hasn't failed here since, but that's not surprising (it was always a "once in a light blue moon" kind of thing).

I'm not sure how/whether this test is supposed to work with Jython -- perhaps the `thread.exit()` I removed could be important there. The test relies on a socket getting closed when a socket object becomes trash and is reclaimed; in CPython that's easy to control, but I don't know why the test didn't/doesn't simply do an explicit s.close() instead.
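For illustration only, here is a minimal sketch of the two ideas in that message: waiting on an explicit signal instead of a sleep(), and closing sockets explicitly instead of relying on garbage collection. It is not the change that was checked in. It uses modern Python 3 and plain TCP (no SSL), and the name `rude_shutdown_sketch`, the event name, and the timeout are all assumptions made for the example.

```python
import socket
import threading


def rude_shutdown_sketch(timeout=5.0):
    """Connect to a listener that accepts and then immediately hangs up.

    Hypothetical helper, not the real test: the main thread waits on an
    Event rather than sleeping, and every socket is closed explicitly.
    """
    listening = threading.Event()
    address = []  # filled in by the listener before `listening` is set

    def listener():
        server = socket.socket()
        try:
            server.settimeout(timeout)     # never block forever in accept()
            server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
            server.listen(1)
            address.append(server.getsockname())
            listening.set()                # explicit signal replaces sleep()
            conn, _ = server.accept()
            conn.close()                   # explicit close, no reliance on GC
        finally:
            server.close()

    t = threading.Thread(target=listener)
    t.start()
    if not listening.wait(timeout):
        t.join(timeout)
        raise RuntimeError("listener never started")

    client = socket.create_connection(address[0], timeout=timeout)
    try:
        t.join(timeout)            # listener has hung up once this returns
        return client.recv(1024)   # empty bytes means the peer closed
    finally:
        client.close()
```

Because the main thread proceeds only after `listening` is set, there is no window in which it can try to connect before the listener is ready, and the explicit close() calls behave the same on implementations without reference counting.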
[1/24/06, Tim Peters]
... test_rude_shutdown() is dicey, relying on a sleep() instead of proper synchronization to make it probable that the `listener` thread goes away before the main thread tries to connect, but while that race may account for bogus TestFailed deaths, it doesn't seem possible that it could account for the kind of failure above.
[Tim Peters]
Well, since it's silly to try to guess about one weird failure when a clear cause for another kind of weird failure is known, I checked in changes to do "proper" thread synchronization and termination in that test. Hasn't failed here since, but that's not surprising (it was always a "once in a light blue moon" kind of thing).
Neal plugged another hole later, but -- alas -- I have seen the same shy failure since then on WinXP. One of the most recent buildbot test runs saw it too, on a non-Windows box:

http://www.python.org/dev/buildbot/trunk/g5%20osx.3%20trunk/builds/204/step-...

test_socket_ssl test test_socket_ssl crashed -- exceptions.TypeError: 'NoneType' object is not callable

in the second test run there. Still no theory! Maybe we can spend the next 3 days sprinting on it :-)
On 2/27/06, Tim Peters <tim.peters@gmail.com> wrote:
Neal plugged another hole later, but-- alas --I have seen the same shy failure since then on WinXP. One of the most recent buildbot test runs saw it too, on a non-Windows box:
http://www.python.org/dev/buildbot/trunk/g5%20osx.3%20trunk/builds/204/step-...
test_socket_ssl test test_socket_ssl crashed -- exceptions.TypeError: 'NoneType' object is not callable
in the second test run there.
For closure, I believe this problem was addressed by revs 42842 and 42844 to Lib/test/test_importhooks.py.

If anyone sees spurious failures with the buildbot (one time failures, crashes, etc), please report the problems to python-dev. It would be great to see if you can reproduce the results with the same tests that failed. We need to determine if it is architecture specific, test-order related, or something else.

n
[Tim Peters]
... test_socket_ssl test test_socket_ssl crashed -- exceptions.TypeError: 'NoneType' object is not callable
[Neal Norwitz]
For closure, I believe this problem was addressed by revs 42842 and 42844 to Lib/test/test_importhooks.py.
Amazingly, the same thing popped up 3(!) times today in the 2.4 buildbot slaves (on two Windows boxes, and the "amd64 gentoo 2.4" box). I merged those revs to the 2.4 branch.
If anyone sees spurious failures with the buildbot (one time failures, crashes, etc), please report the problems to python-dev. It would be great to see if you can reproduce the results with the same tests that failed. We need to determine if it is architecture specific, test-order related, or something else.
One-shot segfaults are still common enough on the Mac box, like the very recent: <http://www.python.org/dev/buildbot/all/g4%20osx.4%202.4/builds/18/step-test/0>
On 3/19/06, Tim Peters <tim.peters@gmail.com> wrote:
If anyone sees spurious failures with the buildbot (one time failures, crashes, etc), please report the problems to python-dev. It would be great to see if you can reproduce the results with the same tests that failed. We need to determine if it is architecture specific, test-order related, or something else.
One-shot segfaults are still common enough on the Mac box, like the very recent: <http://www.python.org/dev/buildbot/all/g4%20osx.4%202.4/builds/18/step-test/0>
Most of the recent failures seem to be related to threads. The most recent info is below. There's a lot more, in case it can help someone debug. It seems that one thread is in pthread_cond_wait and another thread is in PyObject_Call.

n
--
Exception: EXC_BAD_ACCESS (0x0001)
Codes: KERN_INVALID_ADDRESS (0x0001) at 0xdbdbdc23

Thread 0:
0 libSystem.B.dylib 0x9002b8a8 semaphore_wait_signal_trap + 8
1 libSystem.B.dylib 0x9003001c pthread_cond_wait + 488
2 python.exe 0x0010a778 PyThread_acquire_lock + 240 (thread_pthread.h:416)
3 python.exe 0x000cdb84 PyEval_RestoreThread + 116 (ceval.c:306)
4 python.exe 0x000e0d58 file_dealloc + 160 (fileobject.c:310)
5 python.exe 0x00030b10 _Py_Dealloc + 68 (object.c:1872)
6 python.exe 0x000a02a4 tupledealloc + 348 (tupleobject.c:168)
7 python.exe 0x00030b10 _Py_Dealloc + 68 (object.c:1872)
8 python.exe 0x000dcb04 call_function + 1912 (ceval.c:3565)
9 python.exe 0x000d681c PyEval_EvalFrame + 34752 (ceval.c:2163)
10 python.exe 0x000dcfe0 fast_function + 464 (ceval.c:3645)
11 python.exe 0x000dccb0 call_function + 2340 (ceval.c:3584)
12 python.exe 0x000d681c PyEval_EvalFrame + 34752 (ceval.c:2163)
13 python.exe 0x000d9680 PyEval_EvalCodeEx + 4480 (ceval.c:2736)
14 python.exe 0x00118114 function_call + 556 (funcobject.c:548)
15 python.exe 0x00028e18 PyObject_Call + 96 (abstract.c:1795)
16 python.exe 0x000ddd24 ext_do_call + 764 (ceval.c:3840)
17 python.exe 0x000d6b10 PyEval_EvalFrame + 35508 (ceval.c:2203)
18 python.exe 0x000d9680 PyEval_EvalCodeEx + 4480 (ceval.c:2736)
19 python.exe 0x00118114 function_call + 556 (funcobject.c:548)
20 python.exe 0x00028e18 PyObject_Call + 96 (abstract.c:1795)
21 python.exe 0x000cadd4 instancemethod_call + 736 (classobject.c:2447)
22 python.exe 0x00028e18 PyObject_Call + 96 (abstract.c:1795)
23 python.exe 0x0005681c slot_tp_call + 112 (typeobject.c:4533)
24 python.exe 0x00028e18 PyObject_Call + 96 (abstract.c:1795)
25 python.exe 0x000dd90c do_call + 168 (ceval.c:3771)
26 python.exe 0x000dccd0 call_function + 2372 (ceval.c:3586)
27 python.exe 0x000d681c PyEval_EvalFrame + 34752 (ceval.c:2163)
28 python.exe 0x000d9680 PyEval_EvalCodeEx + 4480 (ceval.c:2736)
29 python.exe 0x00118114 function_call + 556 (funcobject.c:548)
30 python.exe 0x00028e18 PyObject_Call + 96 (abstract.c:1795)
31 python.exe 0x000ddd24 ext_do_call + 764 (ceval.c:3840)
32 python.exe 0x000d6b10 PyEval_EvalFrame + 35508 (ceval.c:2203)
33 python.exe 0x000d9680 PyEval_EvalCodeEx + 4480 (ceval.c:2736)
34 python.exe 0x00118114 function_call + 556 (funcobject.c:548)
35 python.exe 0x00028e18 PyObject_Call + 96 (abstract.c:1795)
36 python.exe 0x000cadd4 instancemethod_call + 736 (classobject.c:2447)
37 python.exe 0x00028e18 PyObject_Call + 96 (abstract.c:1795)
38 python.exe 0x0005681c slot_tp_call + 112 (typeobject.c:4533)
39 python.exe 0x00028e18 PyObject_Call + 96 (abstract.c:1795)
40 python.exe 0x000dd90c do_call + 168 (ceval.c:3771)
41 python.exe 0x000dccd0 call_function + 2372 (ceval.c:3586)
42 python.exe 0x000d681c PyEval_EvalFrame + 34752 (ceval.c:2163)
43 python.exe 0x000d9680 PyEval_EvalCodeEx + 4480 (ceval.c:2736)
44 python.exe 0x00118114 function_call + 556 (funcobject.c:548)
45 python.exe 0x00028e18 PyObject_Call + 96 (abstract.c:1795)
46 python.exe 0x000ddd24 ext_do_call + 764 (ceval.c:3840)
47 python.exe 0x000d6b10 PyEval_EvalFrame + 35508 (ceval.c:2203)
48 python.exe 0x000d9680 PyEval_EvalCodeEx + 4480 (ceval.c:2736)
49 python.exe 0x00118114 function_call + 556 (funcobject.c:548)
50 python.exe 0x00028e18 PyObject_Call + 96 (abstract.c:1795)
51 python.exe 0x000cadd4 instancemethod_call + 736 (classobject.c:2447)
52 python.exe 0x00028e18 PyObject_Call + 96 (abstract.c:1795)
53 python.exe 0x0005681c slot_tp_call + 112 (typeobject.c:4533)
54 python.exe 0x00028e18 PyObject_Call + 96 (abstract.c:1795)
55 python.exe 0x000dd90c do_call + 168 (ceval.c:3771)
56 python.exe 0x000dccd0 call_function + 2372 (ceval.c:3586)
57 python.exe 0x000d681c PyEval_EvalFrame + 34752 (ceval.c:2163)
58 python.exe 0x000dcfe0 fast_function + 464 (ceval.c:3645)
59 python.exe 0x000dccb0 call_function + 2340 (ceval.c:3584)
60 python.exe 0x000d681c PyEval_EvalFrame + 34752 (ceval.c:2163)
61 python.exe 0x000d9680 PyEval_EvalCodeEx + 4480 (ceval.c:2736)
62 python.exe 0x000dd150 fast_function + 832 (ceval.c:3656)
63 python.exe 0x000dccb0 call_function + 2340 (ceval.c:3584)
64 python.exe 0x000d681c PyEval_EvalFrame + 34752 (ceval.c:2163)
65 python.exe 0x000d9680 PyEval_EvalCodeEx + 4480 (ceval.c:2736)
66 python.exe 0x000dd150 fast_function + 832 (ceval.c:3656)
67 python.exe 0x000dccb0 call_function + 2340 (ceval.c:3584)
68 python.exe 0x000d681c PyEval_EvalFrame + 34752 (ceval.c:2163)
69 python.exe 0x000dcfe0 fast_function + 464 (ceval.c:3645)
70 python.exe 0x000dccb0 call_function + 2340 (ceval.c:3584)
71 python.exe 0x000d681c PyEval_EvalFrame + 34752 (ceval.c:2163)
72 python.exe 0x000d9680 PyEval_EvalCodeEx + 4480 (ceval.c:2736)
73 python.exe 0x000dd150 fast_function + 832 (ceval.c:3656)
74 python.exe 0x000dccb0 call_function + 2340 (ceval.c:3584)
75 python.exe 0x000d681c PyEval_EvalFrame + 34752 (ceval.c:2163)
76 python.exe 0x000d9680 PyEval_EvalCodeEx + 4480 (ceval.c:2736)
77 python.exe 0x000dd150 fast_function + 832 (ceval.c:3656)
78 python.exe 0x000dccb0 call_function + 2340 (ceval.c:3584)
79 python.exe 0x000d681c PyEval_EvalFrame + 34752 (ceval.c:2163)
80 python.exe 0x000d9680 PyEval_EvalCodeEx + 4480 (ceval.c:2736)
81 python.exe 0x000ce040 PyEval_EvalCode + 84 (ceval.c:484)
82 python.exe 0x00013c7c run_node + 120 (pythonrun.c:1265)
83 python.exe 0x00013be0 run_err_node + 88 (pythonrun.c:1252)
84 python.exe 0x00013b6c PyRun_FileExFlags + 180 (pythonrun.c:1243)
85 python.exe 0x00011fbc PyRun_SimpleFileExFlags + 712 (pythonrun.c:860)
86 python.exe 0x00011408 PyRun_AnyFileExFlags + 168 (pythonrun.c:664)
87 python.exe 0x00008750 Py_Main + 3820 (main.c:493)
88 python.exe 0x000028d8 main + 40 (python.c:23)
89 python.exe 0x00002108 _start + 340 (crt.c:272)
90 python.exe 0x00001fb0 start + 60

Thread 1 Crashed:
0 python.exe 0x00028de8 PyObject_Call + 48 (abstract.c:1794)
1 python.exe 0x0002901c PyObject_CallFunction + 360 (abstract.c:1837)
2 _testcapi.so 0x049903d4 _make_call + 64 (_testcapimodule.c:569)
3 libSystem.B.dylib 0x9002b200 _pthread_body + 96
Neal Norwitz wrote:
On 3/19/06, Tim Peters <tim.peters@gmail.com> wrote:
If anyone sees spurious failures with the buildbot (one time failures, crashes, etc), please report the problems to python-dev. It would be great to see if you can reproduce the results with the same tests that failed. We need to determine if it is architecture specific, test-order related, or something else.

One-shot segfaults are still common enough on the Mac box, like the very recent: <http://www.python.org/dev/buildbot/all/g4%20osx.4%202.4/builds/18/step-test/0>
Most of the recent failures seem to be related to threads. The most recent info is below. There's a lot more, in case it can help someone debug. It seems that one thread is in pthread_cond_wait and another thread is in PyObject_Call.
Hrrm, test_thread_state hands an object pointer off to a different thread without INCREF'ing it first. Could there be a race condition with the deallocation of arguments at the end of test_thread_state?

Specifically, that trace looks to me like the spawned thread is trying to call the function passed in while the main thread is trying to delete the argument tuple passed to test_thread_state.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://www.boredomandlaziness.org
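As a rough Python-level analogy to the hand-off described above (an illustration only, not the C code in _testcapimodule.c and not a diagnosis of either failure in this thread), a weak reference can stand in for a borrowed pointer and a strong reference for an INCREF'ed one. The names `Callback`, `call_from_thread`, `borrowed_handoff`, and `owned_handoff` are hypothetical, and the behaviour shown assumes CPython's immediate reference-count deallocation.

```python
import threading
import weakref


class Callback:
    """Hypothetical callable standing in for the function handed to a thread."""

    def __call__(self):
        return None


def call_from_thread(get_target):
    """Resolve the target inside a new thread, call it, and report the outcome."""
    outcome = []

    def worker():
        target = get_target()
        try:
            target()
            outcome.append("called")
        except TypeError as exc:
            outcome.append(str(exc))

    t = threading.Thread(target=worker)
    t.start()
    t.join()
    return outcome[0]


def borrowed_handoff():
    """Hand over a reference that does not keep the object alive."""
    cb = Callback()
    borrowed = weakref.ref(cb)  # like a pointer passed without INCREF
    del cb                      # the caller drops the only strong reference
    return call_from_thread(borrowed)


def owned_handoff():
    """Hand over a reference that owns the object (the INCREF analogue)."""
    cb = Callback()
    owned = lambda cb=cb: cb    # the default argument holds a strong reference
    del cb                      # the object survives: the thread still owns it
    return call_from_thread(owned)
```

On CPython, `owned_handoff()` reports a successful call, while `borrowed_handoff()` finds the object already gone and reports a TypeError, because a dead weak reference resolves to None. In C there is no such safety net: calling through a pointer to a deallocated object is undefined behaviour, which is the kind of race the message above asks about.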
participants (3): Neal Norwitz, Nick Coghlan, Tim Peters