On 14.05.2011 21:21, Gregory P. Smith wrote:
Makes sense to me.
Something that needs clarifying: when the process dies (main python thread has exited and all remaining python threads are daemon threads) the on thread end hook will _not_ be called.
Good catch! This gotcha should be mentioned in the docs. A daemon thread can end at any point in its life cycle. It's not an issue for my use case: for JCC the hook just frees some resources that are freed anyway when the process ends. Other use cases may need a more deterministic cleanup, but that's out of scope for my proposal. Users can get around the issue with an atexit hook, though.
This also sounds useful since we are a long long way from concurrent gc. (and whenever we gain that, we'd need a way to control when it can or can't happen or to register the gc threads with anything that needs to know about 'em, JCC, etc..)
I thought of a concurrent GC, too. A dedicated GC thread could improve the response time of a GUI or web application if we could split the cyclic garbage detection into two steps. Even on a fast machine, a full GC sweep with millions of objects in gen2 can take up to a second, during which the interpreter is locked. I assume that scanning a million objects takes most of that time. If it were possible to run the scan without the GIL held and then remove the objects in a second step with the GIL acquired, response time could improve. However, that would require a major redesign of the traverse and visit slots.

Back to my proposal. My initial proposal was missing one feature: it should also be possible to alter the default setting for PyThreadState->gc_enabled. JCC could use the additional API to make sure that non-attached threads don't run the GC.

Example of how JCC could use the feature:

lucene.initVM() initializes the Java VM and attaches the current thread. This is usually done in the main thread before any other thread is started. The function would call PyThread_set_gc_enabled(0) to set the default value for new thread states and to prevent any new thread from starting a cyclic GC collect.

lucene.getVM().attachCurrentThread() creates some thread local objects in a TLS and registers the current thread at the Java VM. This would run PyObject_GC_set_thread_enabled(1) to allow GC collect in the current thread.

lucene.getVMEnv().detachCurrentThread() cleans up the TLS and unregisters the thread, so a PyObject_GC_set_thread_enabled(0) is required.

The implementation is rather simple:

- a new static int variable for the default setting and a new flag in the PyThreadState struct
- check PyThreadState_Get()->gc_enabled in _PyObject_GC_Malloc()
- four small functions to set and get the default and thread setting
- three Python functions in the gc module to enable, disable and get the flag from the current PyThreadState
- a function to get the global flag.
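The intended semantics of the per-thread flag can be modeled in pure Python. This is only a sketch: every name below is hypothetical, a threading.local stands in for the flag in PyThreadState, and the default is copied on a thread's first lookup rather than when its thread state is created, which is a simplification of the proposal.

```python
import gc
import threading

_default_enabled = True      # stands in for the proposed static int default
_state = threading.local()   # stands in for PyThreadState->gc_enabled


def set_default_enabled(flag):
    """Set the value that threads without an explicit flag will pick up."""
    global _default_enabled
    _default_enabled = bool(flag)


def set_thread_enabled(flag):
    """Set the flag for the calling thread only."""
    _state.gc_enabled = bool(flag)


def is_thread_enabled():
    # Simplification: copy the default on first lookup in this thread.
    if not hasattr(_state, "gc_enabled"):
        _state.gc_enabled = _default_enabled
    return _state.gc_enabled


def maybe_collect():
    """Run a collection only if the calling thread is allowed to; report whether it ran."""
    if is_thread_enabled():
        gc.collect()
        return True
    return False


# Hypothetical stand-ins for the three JCC entry points described above.
def init_vm():
    set_default_enabled(False)   # new threads must not collect
    attach_current_thread()      # the initializing thread is attached


def attach_current_thread():
    set_thread_enabled(True)


def detach_current_thread():
    set_thread_enabled(False)


def demo():
    """Record whether a fresh thread may collect before, during and after attachment."""
    init_vm()
    seen = {}

    def worker():
        seen["before"] = maybe_collect()
        attach_current_thread()
        seen["attached"] = maybe_collect()
        detach_current_thread()
        seen["detached"] = maybe_collect()

    t = threading.Thread(target=worker)
    t.start()
    t.join()
    seen["main"] = maybe_collect()
    return seen
```

In this model only attached threads ever trigger a collection, which is the property JCC needs.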
I'm not sure if we should expose the global switch for Python code. The attached patch already has all C functionality. If I hear more +1, then I'll write two small PEPs for both feature requests.

Christian