Extension modules, Threading, and the GIL
Martin v. Löwis
martin at v.loewis.de
Thu Jan 2 22:19:48 CET 2003
Greg Chapman <glc at well.com> writes:
> It seems to me that threadstate could be made global, or at least
> thread local, within Python, thus freeing client code from ever
> having to explicitly create a threadstate. For this to work, Python
> would have to have the equivalent of thread local storage on all
> supported platforms.
This is a knock-out criterion. Python currently supports 11 threading
libraries, and a process to remove support for some of them will need
several releases. There currently is no support for TLS, *except
through the thread state itself*.
> Looking over the thread-sig archives, Greg Stein suggested that TLS
> could be emulated on platforms which don't offer it natively using a
> Python dict and a lock (at any rate, it should be possible with some
> sort of synchronized data structure).
This points to another sore spot: that would require reliable thread
identification. Currently, thread identification is broken, as Python
assumes that an int is sufficient to encapsulate a thread id. There
are platforms where this assumption is invalid.
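
To make the dependency on thread identification concrete, here is a
minimal sketch of the dict-and-lock emulation in present-day Python. It is
not Greg Stein's proposal verbatim: the class name EmulatedTLS and its
methods are hypothetical, and it assumes threading.get_ident() returns a
value that is unique among live threads, which is exactly the assumption
questioned above.

```python
import threading


class EmulatedTLS:
    """Per-thread storage emulated with a plain dict guarded by a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._slots = {}  # thread id -> that thread's private dict

    def _my_slot(self):
        # Assumption: get_ident() is unique among live threads.
        tid = threading.get_ident()
        with self._lock:
            return self._slots.setdefault(tid, {})

    def set(self, key, value):
        self._my_slot()[key] = value

    def get(self, key, default=None):
        return self._my_slot().get(key, default)

    def forget_current_thread(self):
        # Thread ids may be reused after a thread exits, so a thread
        # should drop its slot before it finishes.
        with self._lock:
            self._slots.pop(threading.get_ident(), None)
```

Note that every get() and set() takes the shared lock and performs a
dict lookup, so the emulation is only as cheap and as reliable as the
thread id it is keyed on.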
> With only one threadstate per thread, a thread could easily
> determine whether it has the GIL (the threadstate could have some
> sort of active flag which gets set when it obtains the GIL); this
> might solve David Abrahams' problem (not sure).
That is, of course, expensive: Every lock/unlock call needs to find
the TLS as well.
> (It could also allow a thread to call AcquireThread multiple times
> without deadlock; since there would be only one threadstate per
> thread, that state could preserve a lockcount to handle recursive
This is actually what David Abrahams says his problem is: he wants a
recursive lock. Of course, for efficiency, it might be better to use
platform recursive locks where available.
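
For illustration, a lock count layered on a non-recursive lock might look
roughly like the sketch below. The class name CountingLock is
hypothetical, and the sketch again assumes threading.get_ident()
identifies the calling thread reliably. In Python code proper,
threading.RLock already provides this behaviour.

```python
import threading


class CountingLock:
    """Recursive lock built from a plain lock, an owner id and a count."""

    def __init__(self):
        self._block = threading.Lock()  # underlying non-recursive lock
        self._owner = None              # id of the thread holding the lock
        self._count = 0                 # how many times the owner acquired it

    def acquire(self):
        me = threading.get_ident()
        if self._owner == me:
            # Re-entry by the owner: bump the count, do not block.
            self._count += 1
            return
        self._block.acquire()
        self._owner = me
        self._count = 1

    def release(self):
        if self._owner != threading.get_ident():
            raise RuntimeError("release() by a thread that does not hold the lock")
        self._count -= 1
        if self._count == 0:
            # Outermost release: hand the underlying lock back.
            self._owner = None
            self._block.release()
```

Each acquire() and release() has to look up the caller's identity first,
which is the per-call cost mentioned above.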
> Thinking further about this, for this to work cleanly I think Python
> would have to allow only one interpreter per process.
If you are going for TLS, this is not strictly necessary: every
interpreter could maintain its own TLS key. Of course, in cases where
you want to acquire a thread, this would not be helpful, as you then
often don't have an interpreter, either, so you could not find out
what the TLS key is.
> I never use multiple interpreters, so I'm not quite sure what
> they're used for,
People think they can use them to have several independent execution
environments. This is not true, though, as extension modules don't
keep per-interpreter state (but use plain global variables); several
other global tables exist as well.
I believe multiple interpreters were added to silence the recurring
request to have multiple interpreters, either not knowing or
deliberately ignoring that people would not get what they think they
get.
> but I wonder if the need for them could be eliminated by providing a
> new built-in type (sort of like RExec without the security overhead)
> which would initialize itself by doing the stuff that
> Py_NewInterpreter does to get a new copy of the global data space
> and which would provide methods for executing code in that copy of
> the data space.
This would share the quality of RExec, though: it sort-of works, but
if you dig long enough, you'll poke holes in it easily. Unlike
RExec, those holes can't be mended in the Python core itself - you'll
have to patch loads of third-party extension modules. Of
course, it wouldn't be worse than Py_NewInterpreter, except that it
would have to make claims that it couldn't fulfill.