
We do have a real problem here, and I keep stumbling across it. So far, this issue has hit me in the win32 extensions, in Mozilla's PyXPCOM, and even in Gordon's "installer". IMO, the reality is that the Python external thread-state API sucks. I can boldly make that assertion as I have heard many other luminaries say it before me. As Tim suggests, time is the issue.
I fear the only way to approach this is with a PEP. We need to clearly state our requirements, and clearly show scenarios where interpreter states, thread states, the GIL, etc. all need to cooperate. E.g., InterpreterStates seem YAGNI, but manage to complicate using ThreadStates, which are certainly YNI. The ability to "unconditionally grab the lock" may be useful, as may a construct meaning "I'm calling out to/in from an external API", distinct from the single "release/acquire the GIL" construct available today.
I'm willing to help out with this, but not take it on myself. I have a fair bit to gain - if I can avoid toggling locks every time I call out to each and every function there would be some nice perf gains to be had, and horrible code to remove.
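To make the cost of toggling concrete, here is a toy model in Python. It is only a sketch: `ToyGIL`, `external_call`, `per_call` and `batched` are hypothetical names made up for illustration, and a plain `threading.Lock` stands in for the real GIL, which does not work this way internally. It simply counts how many release/acquire pairs happen when every call-out is bracketed individually versus when a whole run of call-outs shares one bracket.

```python
import threading


class ToyGIL:
    """Hypothetical stand-in for a global lock; counts release/acquire pairs."""

    def __init__(self):
        self._lock = threading.Lock()
        self.toggles = 0  # completed release/acquire pairs

    def acquire(self):
        self._lock.acquire()

    def release(self):
        self._lock.release()

    def allow_threads(self):
        # Returns a context manager that releases on entry, re-acquires on exit.
        return _AllowThreads(self)


class _AllowThreads:
    def __init__(self, gil):
        self._gil = gil

    def __enter__(self):
        self._gil.release()
        return self

    def __exit__(self, exc_type, exc, tb):
        self._gil.acquire()
        self._gil.toggles += 1
        return False  # never swallow exceptions


def external_call(x):
    # Placeholder for a call into an external API that touches no interpreter state.
    return x * 2


def per_call(gil, items):
    # One release/acquire pair around each and every call-out.
    results = []
    for item in items:
        with gil.allow_threads():
            value = external_call(item)
        results.append(value)  # back under the lock before touching shared state
    return results


def batched(gil, items):
    # One release/acquire pair around the whole run of call-outs.
    # In real extension code the intermediate values would live in C variables.
    with gil.allow_threads():
        values = [external_call(item) for item in items]
    return values
```

With the lock held on entry, `per_call` performs one toggle per item while `batched` performs exactly one, whatever the number of items. Whether batching is safe in real code depends on none of the bracketed calls needing the interpreter, which is the kind of scenario a PEP would have to spell out.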
I welcome a PEP on this! It's above my own level of expertise, mostly because I'm never in a position to write code that runs into this...
--Guido van Rossum (home page: http://www.python.org/~guido/)