| The fork does some other things though, and it's hard to say which one is
| affecting the execution of your program, especially since PyLucene is doing
| a bunch of things at the machine-code level which are highly surprising to
| Python programmers.
I'm too old to be highly surprised any more, and I'm barely a Python programmer... :-)
| Have you tried running 'strace' on this process yet?
No, I hadn't, thanks. Running ktrace sheds a little light on things. The final output before it writes the core file is:
7826 Python CALL write(0x2,0xbfff971f,0x18)
7826 Python GIO fd 2 wrote 24 bytes "thread_get_state failed "
7826 Python RET write 24/0x18
7826 Python CALL sigprocmask(0x3,0xbfff9b08,0)
7826 Python RET sigprocmask 0
7826 Python CALL kill(0x1e92,0x6)
7826 Python RET kill 0
7826 Python PSIG SIGABRT SIG_DFL
7826 Python NAMI "/cores/core.7826"
Nothing fails before that. The "thread_get_state failed\n" message does not appear in the twisted server log. The kill is an abort to this process (0x1e92 = 7826). So something (but not a system call) has gone wrong and the code has called abort. That at least explains why the server abruptly dies.
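The decoding of the kill arguments can be double-checked in a couple of lines of Python. This is only a sketch: the two hex strings are copied from the trace above, and the comparison with signal.SIGABRT assumes a platform where SIGABRT is numbered 6 (true on Mac OS X and Linux).

```python
import signal

pid = int("0x1e92", 16)   # first argument of kill() in the trace
signum = int("0x6", 16)   # second argument of kill() in the trace

print(pid)                        # the pid the signal was sent to
print(signum == signal.SIGABRT)   # True where SIGABRT is numbered 6
```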
Running gdb python /cores/core.7826 and using where/bt provides no useful info:
(gdb) where
#0 0x00000000 in _mh_dylib_header ()
From the kdump output, the process is clearly in the middle of doing
PyLucene things (there are a bunch of access calls to files that are in my PyLucene index directories).
Grepping for 'thread_get_state failed' in Python, Twisted, and PyLucene gets me just one hit:
$ grep -i 'thread_get_state failed' /usr/local/lib/*
Binary file /usr/local/lib/libgcj.6.dylib matches
That is a GCJ library file distributed with the Mac OS X binary version of PyLucene. Googling "thread_get_state failed" gives some leads.
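For anyone without grep to hand, the same kind of search can be roughed out in Python. This is a minimal sketch, not what was actually run: files_containing is a made-up helper name, the match is case-sensitive (unlike grep -i), it only looks at files directly inside the directory, and it reads each file fully into memory, which is fine for a one-off check but not for huge files.

```python
import os

def files_containing(directory, needle):
    """Return sorted paths of regular files directly under `directory`
    whose raw bytes contain `needle`.

    Hypothetical helper for illustration; case-sensitive, non-recursive.
    """
    hits = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue  # skip subdirectories and other non-regular entries
        try:
            with open(path, "rb") as fh:
                data = fh.read()
        except OSError:
            continue  # unreadable file: skip it rather than abort the scan
        if needle in data:
            hits.append(path)
    return hits

if __name__ == "__main__":
    lib_dir = "/usr/local/lib"  # directory searched in the grep above
    if os.path.isdir(lib_dir):
        for path in files_containing(lib_dir, b"thread_get_state failed"):
            print(path)
```

Opening the files in binary mode matters here, since the string of interest lives inside a compiled library rather than a text file.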
I'll go bug the PyLucene folks now... :-)
Thanks again for the trace suggestion.