[Python-Dev] pthreads, fork, import, and execvp

Gregory P. Smith greg at krypto.org
Sat Jul 25 19:25:35 CEST 2009


On Thu, Jul 23, 2009 at 4:28 PM, Thomas Wouters<thomas at python.org> wrote:
>
> So attached (and at http://codereview.appspot.com/96125/show ) is a
> preliminary fix, correcting the problem with os.fork(), os.forkpty() and
> os.fork1(). This doesn't expose a general API for C code to use, for two
> reasons: it's not easy, and I need this fix more than I need the API change
> :-) (I actually need this fix myself for Python 2.4, but it applies fairly
> cleanly.)

This looks good to me.

Your idea of making this an API called a 'fork lock' or something
sounds good (though I think it needs a better name.  PyBeforeFork &
PyAfterFork?).  The subprocess module, for example, disables garbage
collection before forking and restores it afterwards to avoid
http://bugs.python.org/issue1336.  That type of thing could also be
done in such a function.
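
The gc handling mentioned above can be sketched roughly as follows. This is a minimal illustration, not the actual subprocess code: the function name fork_then_exec is hypothetical, and it only shows the disable-before-fork, restore-in-parent pattern.

```python
import gc
import os


def fork_then_exec(argv):
    """Fork with the garbage collector disabled, then exec argv in the child.

    Hypothetical helper: it only illustrates the pattern of switching gc off
    around fork() so that no collection runs in the child before exec().
    """
    gc_was_enabled = gc.isenabled()
    gc.disable()
    try:
        pid = os.fork()
    except BaseException:
        # fork() failed, so we are still in the parent: restore gc state.
        if gc_was_enabled:
            gc.enable()
        raise

    if pid == 0:
        # Child: gc stays disabled until exec replaces the process image.
        try:
            os.execvp(argv[0], argv)
        finally:
            # Only reached if exec failed; never return into the parent's code.
            os._exit(255)

    # Parent: restore the collector to whatever state it was in before.
    if gc_was_enabled:
        gc.enable()
    return pid
```

The parent restores the previous gc state rather than enabling it unconditionally, so a caller that had already disabled gc is left alone.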

Related to the subprocess fork + gc bug above: it would be nice for
CPython to have code that does the fork, the miscellaneous twiddling
and the exec all in one C call, without needing to execute Python code
in the child process before the exec().  Most of the inner body of
subprocess's _execute_child() method could be done that way (with the
obvious exception of the preexec_fn).
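
For comparison, later Python 3 releases (3.8 and up, on POSIX) expose os.posix_spawn(), which has roughly this shape: the fork, the file-descriptor setup and the exec all happen in C, with no Python code run in the child. The sketch below uses that function only to illustrate the idea; it is not the API being proposed in this thread, and the helper name spawn_quietly is hypothetical.

```python
import os


def spawn_quietly(argv):
    """Run argv with stdout sent to the null device and return its exit code.

    Hypothetical helper. argv[0] must be a path to the executable, since
    os.posix_spawn() does not search PATH.
    """
    # File actions are carried out in the child, in C, before the exec.
    file_actions = [
        (os.POSIX_SPAWN_OPEN, 1, os.devnull, os.O_WRONLY, 0),
    ]
    pid = os.posix_spawn(argv[0], argv, os.environ, file_actions=file_actions)
    _, status = os.waitpid(pid, 0)
    if not os.WIFEXITED(status):
        raise RuntimeError("child did not exit normally")
    return os.WEXITSTATUS(status)
```

A preexec_fn has no equivalent here, which matches the exception noted above: arbitrary Python callbacks are exactly what cannot be moved into a single C call.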

>
> To fix the mutex-across-fork problem correctly we should really acquire
> three locks (the import lock, the GIL and the TLS lock, in that order.) The
> import lock is re-entrant, and the TLS lock is tightly confined to the
> thread-local-storage lookup functions, but the GIL is neither re-entrant nor
> inspectable. That means we can't use pthread_atfork (we can't tell whether
> we have the GIL already or not, right before the fork), nor can we provide a
> convenient API for extension modules to use, really. The best we can do is
> provide three functions, pthread_atfork-style: a 'prepare' function, an
> 'after-fork-in-child' function, and an 'after-fork-in-parent' function. The
> 'prepare' function would start by releasing the GIL, then acquire the import
> lock, the GIL and the TLS lock in that order. It would also record
> *somewhere* the thread_ident of the current thread. The 'in-parent' function
> would simply release the TLS lock and the import lock, and the 'in-child'
> would release those locks, call the threading._after_fork() function, and fix
> up the TLS data, using the recorded thread ident to do lookups. The
> 'in-child' function would replace the current PyOS_AfterFork() function
> (which blindly reinitializes the GIL and the TLS lock, and calls
> threading._after_fork().)
>
> Alternatively we could do what we do now, which is to ignore the fact that
> thread idents may change by fork() and thus that thread-local data may
> disappear, in which case the 'in-child' function could do a little less
> work. I'm sufficiently scared of and disgusted by the TLS implementation, the
> threading implementations we support (including the pthread one) and the way
> we blindly type-pun a pthread_t to a long, that I'm not sure I want to try
> and build something correct and reliable on top of it.
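
The three pthread_atfork-style functions described above can be sketched at the Python level as follows. This is only an illustration under several assumptions: import_lock and tls_lock are hypothetical stand-ins for the interpreter's internal locks, the GIL cannot be handled from Python code at all and is omitted, and os.register_at_fork() is a later addition (Python 3.7 and up) rather than anything that exists in the versions under discussion.

```python
import os
import threading

# Hypothetical stand-ins: the real import lock and TLS lock live inside the
# interpreter and are not plain threading objects.
import_lock = threading.RLock()   # re-entrant, like the import lock
tls_lock = threading.Lock()       # narrowly scoped, like the TLS lock

# Where 'prepare' records which thread performed the fork.
fork_state = {"thread_ident": None}


def prepare():
    """Acquire the locks in a fixed order and note the forking thread."""
    import_lock.acquire()
    tls_lock.acquire()
    fork_state["thread_ident"] = threading.get_ident()


def after_in_parent():
    """Release the locks in reverse order of acquisition."""
    tls_lock.release()
    import_lock.release()


def after_in_child():
    """Release the locks in the child, where only the forking thread exists.

    Assumes the forking thread keeps the same ident in the child, which is
    the very property the quoted text is wary of relying on.
    """
    tls_lock.release()
    import_lock.release()
    # A real implementation would also fix up thread-local data here, using
    # fork_state["thread_ident"] for the lookups.


def install_fork_hooks():
    """Register the three functions so that os.fork() calls them."""
    os.register_at_fork(
        before=prepare,
        after_in_parent=after_in_parent,
        after_in_child=after_in_child,
    )
```

Taking the locks in one fixed order in 'prepare' is what guarantees that no other thread holds any of them at the moment of the fork.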
>
> --
> Thomas Wouters <thomas at python.org>
>
> Hi! I'm a .signature virus! copy me into your .signature file to help me
> spread!
>

