[Python-Dev] Extension modules, Threading, and the GIL

David Abrahams dave@boost-consulting.com
Wed, 08 Jan 2003 10:40:33 -0500


"Mark Hammond" <mhammond@skippinet.com.au> writes:

> My goal:
>
> For a multi-threaded application (generally this will be a larger
> app embedding Python, but that is irrelevant), make it reasonably
> easy to accomplish 2 things:
>
> 1) Allow "arbitrary" threads (that is, threads never before seen by
> Python) to acquire the resources necessary to call the Python C API.
>
> 2) Allow Python extensions to be written which support (1) above.

  3) Allow "arbitrary" threads to acquire the resources necessary to
     call the Python C API, even if they already have those resources,
     and to later release them only if they did not already have them.
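
To make (3) concrete, here is a rough sketch of the behaviour I have in
mind. It is only an illustration under stated assumptions: it uses a toy
pthread mutex as a stand-in for the interpreter lock, and the names
toy_ensure, toy_release and toy_have_lock are made up for this example;
none of them is part of the Python C API.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Toy stand-in for the interpreter lock; NOT the real Python lock. */
static pthread_mutex_t toy_lock = PTHREAD_MUTEX_INITIALIZER;

/* Per-thread flag: does the calling thread currently hold toy_lock? */
static _Thread_local bool toy_held = false;

/* Hypothetical helper: make sure the calling thread holds the lock.
   Returns true if this call acquired it, false if it was already held. */
bool toy_ensure(void)
{
    if (toy_held)
        return false;              /* already ours: nothing to do */
    pthread_mutex_lock(&toy_lock);
    toy_held = true;
    return true;
}

/* Hypothetical helper: undo the matching toy_ensure().  The lock is
   released only if that call was the one that acquired it. */
void toy_release(bool acquired)
{
    assert(toy_held);              /* must be inside an ensure/release pair */
    if (acquired) {
        toy_held = false;
        pthread_mutex_unlock(&toy_lock);
    }
}

/* Hypothetical query: "do I have the lock?" for the calling thread. */
bool toy_have_lock(void)
{
    return toy_held;
}
```

A caller would write `bool got = toy_ensure(); ... toy_release(got);`
around any code that needs the lock. Nested pairs on the same thread are
then harmless: only the outermost pair actually touches the mutex.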

> Currently (2) is covered by Py_BEGIN_ALLOW_THREADS, except that it
> is kinda like only having a hammer in your toolbox <wink>.  I assert
> that 2) could actually be split into discrete goals:

I'm going to ask some questions just to make sure your terminology is
clear to me:

> 2.1) Extension functions that expect to take a lot of time, but
> generally have no thread-state considerations.  This includes
> sleep(), all IO functions, and many others.  This is exactly what
> Py_BEGIN_ALLOW_THREADS was designed for.

In other words, functions which will not call back into the Python API?
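
If so, my mental model of case 2.1 is something like the sketch below.
It again assumes a toy mutex in place of the real interpreter lock, and
every name in it (toy_gil_acquire, toy_gil_release, blocking_work,
toy_extension_call) is hypothetical. The release/re-acquire pair plays
roughly the role of Py_BEGIN_ALLOW_THREADS and its matching end macro.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Toy stand-in for the interpreter lock; NOT the real Python lock. */
static pthread_mutex_t toy_gil = PTHREAD_MUTEX_INITIALIZER;
static _Thread_local bool toy_gil_held = false;

void toy_gil_acquire(void)
{
    pthread_mutex_lock(&toy_gil);
    toy_gil_held = true;
}

void toy_gil_release(void)
{
    assert(toy_gil_held);          /* only the holder may release */
    toy_gil_held = false;
    pthread_mutex_unlock(&toy_gil);
}

/* Records whether the lock was held while the slow work ran. */
static bool held_during_work = true;

/* Stand-in for sleep() or an IO call: touches no interpreter state
   and never calls back into the (toy) API. */
static int blocking_work(int x)
{
    held_during_work = toy_gil_held;
    return x * 2;
}

/* A case-2.1 extension function: entered with the lock held, drops it
   around the slow call, and holds it again before returning. */
int toy_extension_call(int x)
{
    assert(toy_gil_held);
    toy_gil_release();             /* roughly Py_BEGIN_ALLOW_THREADS */
    int result = blocking_work(x);
    toy_gil_acquire();             /* roughly the matching end macro */
    return result;
}
```

The unconditional release is fine here precisely because blocking_work
never needs the lock; that assumption is what breaks down in case 2.2.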

> 2.2) Extensions that *may* take a little time, but more to the
> point, may directly and synchronously trigger callbacks.  

By "callbacks", do you mean "functions which (may) use the
Python C API?"

> That is, it is not expected that much time will be spent outside of
> Python, but rather that Python will be re-entered.  I can concede
> that functions that may trigger asynch callbacks need no special
> handling here, as the normal Python thread switch mechanism will
> ensure their correct dispatch.

By "trigger asynch callbacks" do you mean, "cause a callback to occur
on a different thread?"

> Currently 2.1 and 2.2 are handled the same way, but this need not be
> the case.  Currently 2.2 is only supported by *always* giving up the
> lock, and at each entry point *always* re-acquiring it.  This is
> obviously wasteful if indeed the same thread immediately re-enters -
> hence we are here with a request for "how do I tell if I have the
> lock?".  

Yep, that pinpoints my problem.

> Combine this with the easily stated but tricky to implement (1) and
> no one understands it at all <frown>
>
> I also propose that we restrict this to applications that intend to
> use a single "PyInterpreterState" - if you truly want multiple
> threads running in multiple interpreters (and good luck to you - I'm
> not aware that anyone has ever actually done it <wink>) then you are
> on your own.

Fine with me ;-).  I think eventually we'll need to come up with a
more precise definition of exactly when "you're on your own", but for
now that'll do.

> Are these goals a reasonable starting point?  This describes all my
> venturing into this area.

Sounds about right to me.

-- 
                       David Abrahams
   dave@boost-consulting.com * http://www.boost-consulting.com
Boost support, enhancements, training, and commercial distribution