[David Ascher]
test_fork1 fails on Linux with threads on SMP machines. (It's irrelevant without threads.) This is a hard failure -- the process can either SEGV or hang forever. Is this a showstopper? SMP boxes are becoming increasingly common both as servers and workstations.
Why does the test fail? I'd hate to see the thousands (nay, hundreds of thousands) of users complain that foo isn't working just because the test for a rarely used feature on a rare platform is broken.
Threads and fork() don't seem to mix on Linux. Even on a UP machine things seem strange:

http://www.deja.com/[ST_rn=ps]/getdoc.xp?AN=613477888&fmt=text

I tried to reproduce the problem with a C program but could not. When things hang the forking thread is stuck in wait4() while the child process is suspended:

#0 0x4027d9ba in sigsuspend () from /lib/libc.so.6
#1 0x40232c77 in __pthread_wait_for_restart_signal () from /lib/libpthread.so.0
#2 0x4023406e in __pthread_lock () from /lib/libpthread.so.0
#3 0x4023186a in pthread_mutex_lock () from /lib/libpthread.so.0
#4 0x806fbaa in PyThread_release_lock (lock=0x81ebb68) at thread_pthread.h:339
#5 0x805617b in eval_code2 (co=0x81eca68, globals=0x81c4f64, locals=0x0, args=0x81be278, argcount=0, kws=0x0, kwcount=0, defs=0x0, defcount=0, owner=0x0) at ceval.c:630
#6 0x805ac19 in call_function (func=0x81ebb2c, arg=0x81be26c, kw=0x0) at ceval.c:2552
#7 0x805a5a4 in PyEval_CallObjectWithKeywords (func=0x81ebb2c, arg=0x81be26c, kw=0x0) at ceval.c:2390
#8 0x80b2c7b in t_bootstrap (boot_raw=0x81ebb50) at ./threadmodule.c:224
#9 0x40230c8f in pthread_start_thread () from /lib/libpthread.so.0

I don't know if this is a LinuxThread problem or a Python problem.

Neil

--
The internet: Learn what you know. Share what you don't.
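For readers who want to probe the scenario described above (forking while other threads are running, then waiting on the child), here is a minimal sketch in Python. It is not the test_fork1 source; the function name fork_with_threads and its parameters are hypothetical, and it uses modern Python APIs that did not exist when this thread was written. It polls the child with a timeout so that a hung child is reported rather than blocking the parent forever.

```python
import os
import signal
import threading
import time


def fork_with_threads(num_threads=4, timeout=10.0):
    """Fork while worker threads run (hypothetical helper, not test_fork1).

    Returns the child's exit code, a negative signal number if the child
    was killed by a signal, or None if it did not exit within `timeout`.
    """
    stop = threading.Event()

    def worker():
        # Keep other threads active so the fork happens in a threaded process.
        while not stop.is_set():
            time.sleep(0.01)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    try:
        pid = os.fork()
        if pid == 0:
            # Child: only the forking thread exists here. Exit at once,
            # bypassing normal interpreter cleanup.
            os._exit(0)
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            done, status = os.waitpid(pid, os.WNOHANG)
            if done == pid:
                if os.WIFEXITED(status):
                    return os.WEXITSTATUS(status)
                return -os.WTERMSIG(status)
            time.sleep(0.05)
        # The child appears hung: kill it and reap it so no zombie remains.
        os.kill(pid, signal.SIGKILL)
        os.waitpid(pid, 0)
        return None
    finally:
        stop.set()
        for t in threads:
            t.join()
```

A return value of None would correspond to the "hang forever" symptom, and a negative value to a crash such as SEGV. A clean run proves little, since the failure reported here was intermittent and specific to SMP machines.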
Threads and fork() don't seem to mix on Linux. Even on a UP machine things seem strange:
I believe it. My general point, however, is that even if the problem can't be fixed because Linux is broken in some way, the test suite should be fixed, even if that means ignoring failures of test_fork1 when the system was configured --with-thread.
--david
I tried to reproduce the problem with a C program but could not. When things hang the forking thread is stuck in wait4() while the child process is suspended:
This looks very suspect.
#0 0x4027d9ba in sigsuspend () from /lib/libc.so.6
#1 0x40232c77 in __pthread_wait_for_restart_signal () from /lib/libpthread.so.0
#2 0x4023406e in __pthread_lock () from /lib/libpthread.so.0
#3 0x4023186a in pthread_mutex_lock () from /lib/libpthread.so.0
#4 0x806fbaa in PyThread_release_lock (lock=0x81ebb68) at thread_pthread.h:339
#5 0x805617b in eval_code2 (co=0x81eca68, globals=0x81c4f64,
And very much like the Python thread-state is not being managed correctly with fork. From my understanding of fork (which is small), and of the Python thread-state management, this doesn't surprise me.

Given the stack trace, it appears that Python is doing its periodic thread-release as part of running around the main loop. In the process of _releasing_ the thread-lock, it needs to _acquire_ a mutex. I don't know the Python threading on pthreads at all. Would this be correct? (It seems likely that such an implementation would be done this way.) But in the process of acquiring that mutex, we call __pthread_wait_for_restart_signal().

Is it possible that it is something as simple as thread-idents changing underneath Python when using fork? It seems to me that the thread thinks it is either new or stale.

Just my 2c worth - and given my knowledge of Linux and pthreads, plus the state of our dollar at the moment, it had better be AUD $0.02 :-)

Mark.
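One general hazard that fits the shape of this trace, though nothing in the thread confirms it as the cause, is that fork() copies the state of every lock but only the forking thread. A lock held by another thread at fork time stays locked in the child, with nobody left to release it. The sketch below shows that effect at the Python level with modern APIs; the helper name child_sees_held_lock is hypothetical, and it says nothing about what LinuxThreads did internally.

```python
import os
import threading


def child_sees_held_lock(timeout=1.0):
    """Return True if a forked child cannot acquire a lock that another
    thread held at fork time (hypothetical demonstration helper)."""
    lock = threading.Lock()
    held = threading.Event()
    release = threading.Event()

    def holder():
        # Hold the lock until the parent tells us to let go.
        with lock:
            held.set()
            release.wait()

    t = threading.Thread(target=holder)
    t.start()
    held.wait()  # the lock is definitely held before we fork
    try:
        pid = os.fork()
        if pid == 0:
            # Child: the holder thread was not copied, but the lock's
            # "locked" state was. Try to take it, with a timeout so the
            # child cannot hang forever.
            got = lock.acquire(timeout=timeout)
            os._exit(0 if got else 1)
        _, status = os.waitpid(pid, 0)
    finally:
        release.set()
        t.join()
    # Exit code 1 means the child timed out waiting for the lock.
    return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 1
```

Without the timeout, the child in this sketch would block indefinitely, which resembles the suspended child in the trace above. Whether the real failure followed this path is an open question in the thread.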
participants (3)
- David Ascher
- Mark Hammond
- Neil Schemenauer