
On 12.05.2011 01:58, Christian Heimes wrote:
Hello,
today I've spent several hours debugging a segfault in JCC [1]. JCC is a framework to wrap Java code for Python. It's most prominently used in PyLucene [2]. You can read more about my debugging in [3].
With JCC, every Python thread must be registered with the JVM through JCC. An unattached thread that accesses a wrapped Java object leads to errors and may even cause a segfault. "Accessing" also includes garbage collection. A code line like
a = {}
or "a b c".split()
can segfault, since the allocation of a dict or a bound method runs through _PyObject_GC_New(), which may trigger a cyclic garbage collection run. If the current thread isn't attached to the JVM but triggers a gc.collect() with some Java objects in a cycle, the interpreter crashes. It's quite complicated and hard to "fix" third-party tools so that they attach all threads created in the third-party library.
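The effect described above can be reproduced without JCC or a JVM. The following is only a minimal sketch: `Node` and `demo` are hypothetical names, and `Node` stands in for a wrapped foreign object. It shows that an object caught in a reference cycle is finalized in whichever thread happens to run the cyclic collector, not in the thread that created it.

```python
import gc
import threading

finalized_in = []  # names of the threads in which __del__ actually ran


class Node:
    """Hypothetical stand-in for a wrapped foreign (Java/COM) object."""

    def __init__(self):
        # Self-reference: reference counting alone can never free this
        # object, only the cyclic garbage collector can.
        self.me = self

    def __del__(self):
        finalized_in.append(threading.current_thread().name)


def demo():
    finalized_in.clear()
    gc.disable()      # keep automatic collection from running early
    try:
        gc.collect()  # start from a clean state
        Node()        # created in the calling thread, garbage right away
        # The collection is triggered from a different thread.
        worker = threading.Thread(target=gc.collect, name="worker")
        worker.start()
        worker.join()
    finally:
        gc.enable()
    return list(finalized_in)


if __name__ == "__main__":
    print(demo())
```

Here the collection is forced with an explicit gc.collect(); in a real program any allocation of a GC-tracked object can trigger the same run implicitly.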
I have a somewhat similar problem and just noticed this thread. In our software we have multiple threads, and we use a lot of COM objects. COM objects also have the requirement that they must only be used in the same thread (in the same apartment, to be exact) that created them. This also applies to cleaning up with the garbage collector.

When a COM object is part of some Python structure that includes reference cycles, the cycle GC tries to clean up the cycle and cleans up the COM object along with it. This can happen in ANY thread, and in some cases the program crashes or the thread hangs.

Here is my idea to fix this from within Python: the COM objects, when created, keep the name of the currently executing thread. In the __del__ method, where the COM object is cleaned up by calling the COM .Release() method, a check is made whether the current thread is the allowed one. If it is the wrong thread, the COM object is kept alive by appending it to a list. The list is stored in a global dictionary indexed by thread name.

The remaining goal is to clear the lists in the dict from inside the valid thread, which is done on every creation of a COM object, on every destruction of a COM object, and in the CoUninitialize function that every thread using COM must call before it ends.

At least that's my plan. Maybe you can use a similar approach?

Thomas
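A minimal sketch of the plan above, in plain Python without any real COM calls. All names here (`ThreadBound`, `FakeHandle`, `release_pending`) are hypothetical. It departs from the description in two labeled ways: it keys the dictionary by thread ident rather than thread name, because names need not be unique, and it parks the underlying handle rather than the wrapper object itself, on the assumption that a resurrected wrapper would not get its __del__ called a second time on current CPython.

```python
import threading
from collections import defaultdict

# owner thread ident -> handles waiting to be released in that thread
_pending = defaultdict(list)
# Reentrant, because __del__ can run while this thread already holds it.
_pending_lock = threading.RLock()


def release_pending():
    """Release every handle that was parked for the calling thread."""
    me = threading.get_ident()
    with _pending_lock:
        handles = _pending.pop(me, [])
    for handle in handles:
        handle.release()


class FakeHandle:
    """Hypothetical stand-in for a raw COM interface pointer."""

    def __init__(self):
        self.released_in = None

    def release(self):
        # A real wrapper would call the COM .Release() method here.
        self.released_in = threading.get_ident()


class ThreadBound:
    """Wrapper that only releases its handle in the creating thread."""

    def __init__(self, handle):
        self._handle = handle
        self._owner = threading.get_ident()
        release_pending()  # drain on every creation

    def __del__(self):
        if threading.get_ident() == self._owner:
            self._handle.release()
            release_pending()  # drain on every destruction
        else:
            # Wrong thread: keep the handle alive for its owner.
            with _pending_lock:
                _pending[self._owner].append(self._handle)


if __name__ == "__main__":
    handle = FakeHandle()
    box = [ThreadBound(handle)]
    # Drop the last reference from another thread.
    worker = threading.Thread(target=box.clear)
    worker.start()
    worker.join()
    print(handle.released_in)  # still None: release was deferred
    release_pending()          # the owning thread drains its list
    print(handle.released_in == threading.get_ident())
```

A thread that owns such objects would also call release_pending() once more just before it ends, which corresponds to the CoUninitialize step in the plan. Anything parked for a thread that has already exited is never released in this sketch.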