[Python-ideas] Threading hooks and disable gc per thread

Thomas Heller theller at ctypes.org
Fri May 27 12:04:40 CEST 2011


On 12.05.2011 01:58, Christian Heimes wrote:
> Hello,
>
> today I've spent several hours debugging a segfault in JCC [1]. JCC is a
> framework to wrap Java code for Python. It's most prominently used in
> PyLucene [2]. You can read more about my debugging in [3]
>
> With JCC every Python thread must be registered at the JVM through JCC.
> An unattached thread that accesses a wrapped Java object leads to
> errors and may even cause a segfault. Accessing also includes garbage
> collection. A code line like
>
>     a = {}
>
> or
>
>     "a b c".split()
>
> can segfault since the allocation of a dict or a bound method runs
> through _PyObject_GC_New(), which may trigger a cyclic garbage
> collection run. If the current thread isn't attached to the JVM but
> triggers a gc.collect() with some Java objects in a cycle, the
> interpreter crashes. It's quite complicated and hard to "fix" third
> party tools to attach all threads created in the third party library.

I have a somewhat similar problem and just noticed this thread.
In our software, we have multiple threads, and we use a lot of COM
objects.
COM objects also have the requirement that they must only be used in the
thread (in the apartment, to be exact) that created them.
This also applies to cleaning them up with the garbage collector.

When a COM object is part of a Python structure that includes
reference cycles, the cyclic gc breaks the cycle and, in doing so,
cleans up the COM object.  This can happen in ANY thread, and in some
cases the program crashes or the thread hangs.
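
To illustrate the point, here is a minimal sketch in plain Python with no COM involved. The Resource class and the thread name "worker" are made up for the demonstration. It shows that an object created in one thread can be finalized in whichever thread happens to run the collector:

```python
import gc
import threading

finalized = []  # (creating thread, finalizing thread) pairs


class Resource:
    """Hypothetical stand-in for a thread-bound object such as a COM wrapper."""

    def __init__(self):
        self.owner = threading.current_thread().name

    def __del__(self):
        finalized.append((self.owner, threading.current_thread().name))


def worker():
    holder = {"res": Resource()}
    holder["self"] = holder  # reference cycle: only the cycle collector frees it


gc.disable()  # keep automatic collections out of the demonstration
t = threading.Thread(target=worker, name="worker")
t.start()
t.join()

collector = threading.current_thread().name
gc.collect()  # runs here, not in "worker", and finalizes the worker's object
gc.enable()
print(finalized)
```

The Resource is created in "worker" but its __del__ runs in the thread that called gc.collect(). With automatic collection enabled, that can be any thread that happens to allocate a container object.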

Here is my idea to fix this from within Python:
The COM objects, when created, keep the name of the currently executing
thread. In the __del__ method, where the cleanup of the COM object
happens by calling the COM .Release() method, a check is made whether
the current thread is the allowed one.  If it is the wrong thread,
the COM object is kept alive by appending it to some list. The list is
stored in a global dictionary indexed by the thread name.

The remaining goal is to clear the lists in the dict inside the valid
thread - which is done on every creation of a COM object, on every
destruction of a COM object, and in the CoUninitialize function that
every thread using COM must call before it ends.  At least that's
my plan.

Maybe you can use a similar approach?

Thomas



