Donovan Baarda email@example.com writes:
I've Cc'ed this to zope-coders as it might affect other Zope developers and it had me stumped for ages. I couldn't find anything on it anywhere, so I figured it would be good to get something into google :-).
We are developing a Zope 2.7 application on Debian GNU/Linux that uses fop to generate PDFs from xml-fo data. fop is a Java thing, and we are using popen2.Popen3() in non-blocking mode with a select loop to write/read stdin/stdout/stderr. This was all working fine.
Then over the Christmas chaos, various things on my development system were apt-get updated, and I noticed that java/fop had started segfaulting. I tried running fop with the exact same input data from the command line; it worked. I wrote a python script that invoked fop in exactly the same way as we were invoking it inside zope; it worked. It only segfaulted when invoked inside Zope.
I googled and tried everything... switched from j2re1.4 to kaffe, rolled back to a previous version of Python, re-built Zope, upgraded Zope from 2.7.2 to 2.7.4; nothing helped. Then I went back from a Linux 2.6.8 kernel to a 2.4.27 kernel; it worked!
After googling around, I found references to recent attempts to resolve some signal handling problems in Python threads. There was one post that mentioned subtle differences between how Linux 2.4 and Linux 2.6 did signals to threads.
You've left out a very important piece of information: which version of Python you are using. I'm guessing 2.3.4. Can you try 2.4?
So it seems this is a problem with Python threads and Linux kernel 2.6. The attached program demonstrates that it has nothing to do with Zope. Using it to run "fop-test /usr/bin/fop </dev/null" on a Debian box with fop installed will show the segfault. Running the same thing on a machine with a 2.4 kernel will instead get the fop "usage" message. It is not a generic fop/java problem with 2.6, because the commented-out un-threaded line works fine. Not every command segfaults, either... "cat -" works OK, so something about java must be contributing.
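For readers without the attachment, here is a rough sketch of the kind of comparison described above: run the same command once from the main thread and once from a worker thread, then compare the results. This is not the original fop-test program. It assumes modern Python 3 and the subprocess module rather than popen2, and the names run and run_in_thread are made up for illustration.

```python
import subprocess
import threading


def run(cmd):
    """Run cmd with stdin at EOF; return (exit status, stdout, stderr)."""
    proc = subprocess.run(cmd, stdin=subprocess.DEVNULL,
                          stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    return proc.returncode, proc.stdout, proc.stderr


def run_in_thread(cmd):
    """Run cmd from a worker thread instead of the main thread."""
    result = []
    # The child is spawned by the worker, so it inherits whatever
    # per-thread state (such as the signal mask) that thread has.
    worker = threading.Thread(target=lambda: result.append(run(cmd)))
    worker.start()
    worker.join()
    return result[0]
```

Comparing run(["/usr/bin/fop"]) with run_in_thread(["/usr/bin/fop"]) is the threaded versus un-threaded test. With subprocess, a negative exit status means the child was killed by that signal number, so a segfault would show up as -signal.SIGSEGV. There is no claim here that a current Python still shows the difference.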
After searching the Python bugs, the closest I could find was #971213 http://sourceforge.net/tracker/?group_id=5470&atid=105470&func=detail&aid=971213. Is this the same bug? Should I submit a new bug report? Is there any other way I can help resolve this?
I'd be astonished if this is the same bug.
The main oddness about python threads (before 2.3) is that they run with all signals masked. You could play with a C wrapper (call sigprocmask, then exec fop) to see if this is what is causing the problem. But please try 2.4.
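The wrapper idea can be sketched as follows. This is the same experiment written in Python 3 rather than C, on the assumption that signal.pthread_sigmask is available (Python 3.3+ on Unix). The function names are hypothetical. The signal mask survives exec, so clearing it just before exec means the new program starts with nothing blocked.

```python
import os
import signal


def unblock_all_signals():
    """Clear the calling thread's signal mask; return the previous mask."""
    return signal.pthread_sigmask(signal.SIG_SETMASK, set())


def exec_unmasked(argv):
    """Replace this process with argv, started with no signals blocked."""
    unblock_all_signals()
    os.execvp(argv[0], argv)  # does not return on success
```

Saved as a small script that calls exec_unmasked(sys.argv[1:]), this could be put in front of fop in the threaded test. If the segfault goes away, the inherited signal mask is the likely culprit.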
BTW, built-in file objects really could use better non-blocking support... I've got a half-drafted PEP for it... anyone interested in it?
Err, this probably should be in a different mail :)