[Python-Dev] pthreads, fork, import, and execvp

Nick Coghlan ncoghlan at gmail.com
Thu Jul 16 23:07:22 CEST 2009


Thomas Wouters wrote:
> Picking up a rather old discussion... We encountered this bug at Google
> and I'm now "incentivized" to fix it.
> 
> For a short recap: Python has an import lock that prevents more than one
> thread from doing an import at any given time. However, unlike most of
> the locks we have lying around, we don't clear that lock in the child
> after an os.fork(). That means that doing an os.fork() during an import
> means the child process can't do any other imports. It also means that
> doing an os.fork() *while another thread is doing an import* means the
> child process can't do any other imports.
> 
> Since this three-year-old discussion we've added a couple of
> post-fork-cleanups to CPython (the TLS, the threading module's idea of
> active threads, see Modules/signalmodule.c:PyOS_AfterFork) and we
> already simply discard the memory for other locks held during fork
> (the GIL, see Python/ceval.c:PyEval_ReInitThreads, and the TLS lock in
> Python/thread.c:PyThread_ReInitTLS) -- but not so with the import lock,
> except when the platform is AIX. I don't see any particular reason why
> we aren't doing the same thing to the import lock that we do to the
> other locks, on all platforms. It's a quick fix for a real problem (see
> http://bugs.python.org/issue1590864 and
> http://bugs.python.org/issue1404925 for two bug reports that seem to be
> this very issue).

+1. Leaving deadlock issues around that people can run into without
doing anything wrong in their application is unfriendly.
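
To make the failure mode concrete, here is a minimal sketch of the
general problem, assuming a POSIX platform with os.fork(). It does not
touch the real import lock: demo_lock, hold_lock and child_can_acquire
are hypothetical names, with an ordinary threading.Lock standing in for
the import lock. A helper thread holds the lock while the main thread
forks; the child inherits a locked lock whose owner does not exist in
the child, so it can only make progress if it discards that lock.

```python
import os
import threading

# Hypothetical stand-in for the import lock; not the real interpreter lock.
demo_lock = threading.Lock()


def hold_lock(started, release):
    """Hold demo_lock until the main thread signals that it may be released."""
    with demo_lock:
        started.set()
        release.wait()


def child_can_acquire(reinit):
    """Fork while another thread holds demo_lock.

    Returns True if the child process managed to acquire the lock.
    With reinit=True the child first replaces the inherited lock with a
    fresh one, which is the "discard the old lock" approach.
    """
    global demo_lock
    demo_lock = threading.Lock()
    started = threading.Event()
    release = threading.Event()
    holder = threading.Thread(target=hold_lock, args=(started, release))
    holder.start()
    started.wait()  # the helper thread now owns demo_lock

    pid = os.fork()
    if pid == 0:
        # Child: only the forking thread survives, so nobody will ever
        # release the inherited lock.
        if reinit:
            demo_lock = threading.Lock()  # the old lock is simply leaked
        ok = demo_lock.acquire(timeout=0.5)
        os._exit(0 if ok else 1)

    _, status = os.waitpid(pid, 0)
    release.set()
    holder.join()
    return os.WEXITSTATUS(status) == 0
```

The timeout is only there so the sketch terminates; without it, the
child in the reinit=False case would block forever, which is the
deadlock being discussed.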

> It also seems to me, since we have two or three (depending on how you
> count) uses for it, that we should either add a way to free the old
> locks (to the various threading implementations) or add a way to use
> pthread_atfork portably -- or at least portably across systems with
> fork(). I don't think we should hold off on fixing the big issue (threads
> and fork() being unnecessarily flaky) until we have a good fix
> for the small issue (a tiny bit of a memory leak after a fork() -- in
> the child process only.) Especially since the good fix for the small
> issue might require co-ordination between all the threading
> implementations we have and nobody knows about.

One-off memory leaks for relatively rare operations like fork() don't
bother me all that much, so I agree that having fork() leak another lock
in the child process is preferable to it having a chance to mysteriously
deadlock.
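
As a rough illustration of that trade-off at the Python level, the
sketch below replaces a lock in the child after fork() and lets the old
one leak. It assumes a POSIX platform and a Python version that
provides os.register_at_fork() (added in 3.7, so well after this
thread); cache_lock, _reset_lock_in_child and fork_while_locked are
hypothetical names, and this is not how the interpreter handles its own
import lock.

```python
import os
import threading

# Hypothetical module-level lock that might be held when fork() happens.
cache_lock = threading.Lock()


def _reset_lock_in_child():
    """Replace the inherited lock with a fresh one in the child.

    The old lock object is never released; it is leaked in the child
    process only, which is the small cost accepted here.
    """
    global cache_lock
    cache_lock = threading.Lock()


# Run the reset in every child created by os.fork() from now on.
os.register_at_fork(after_in_child=_reset_lock_in_child)


def fork_while_locked():
    """Fork while the lock is held; return True if the child could acquire it."""
    cache_lock.acquire()
    try:
        pid = os.fork()
        if pid == 0:
            # Child: the hook has already swapped in a fresh lock.
            ok = cache_lock.acquire(timeout=0.5)
            os._exit(0 if ok else 1)
        _, status = os.waitpid(pid, 0)
    finally:
        cache_lock.release()  # parent still owns the original lock
    return os.WEXITSTATUS(status) == 0
```

Resetting the lock is only safe if whatever it protected is also in a
sane state in the child, so this is a sketch of the mechanism rather
than a general recipe.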

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------

