Re: [Python-Dev] pthreads, fork, import, and execvp

9 Sep 2009

      On Wed, Sep 9, 2009 at 20:19, Gregory P. Smith  wrote:
...
On Wed, Sep 9, 2009 at 9:07 AM, Thomas Wouters wrote:
...
On Sat, Jul 25, 2009 at 19:25, Gregory P. Smith  wrote:
...
On Thu, Jul 23, 2009 at 4:28 PM, Thomas Wouters wrote:
...
So attached (and at http://codereview.appspot.com/96125/show ) is a
preliminary fix, correcting the problem with os.fork(), os.forkpty()
and os.fork1(). This doesn't expose a general API for C code to use,
for two reasons: it's not easy, and I need this fix more than I need
the API change :-) (I actually need this fix myself for Python 2.4,
but it applies fairly cleanly.)
This looks good to me.
Anyone else want to take a look at this before I check it in? I updated
the patch (in Rietveld) to contain some documentation about the hazards
of mixing fork and threads, which is the best we can do at the moment,
at least without seriously overhauling the threading APIs (which,
granted, is not that bad an idea, considering the mess they're in.) I've
now thoroughly tested the patch, and for most platforms it's strictly
better. On AIX it *may* behave differently (possibly 'incorrectly' for
specific cases) if something other than os.fork() calls the C fork() and
calls PyOS_AfterFork(), since on AIX it used to nuke the thread lock.
*I* think the new behaviour (not nuking the lock) is the correct thing
to do, but since most places that release the import lock don't bother
to check if the lock was even held, the old behaviour may have been
successfully masking the bug on AIX systems.
Perhaps for the backport to 2.6 (which I think is in order, and also in
accordance with the guidelines) I should leave the AIX workaround in?
Anyone think it should not be removed from 3.x/2.7?
...
Your idea of making this an API called a 'fork lock' or something
sounds good (though I think it needs a better name.  PyBeforeFork &
PyAfterFork?).  The subprocess module, for example, disables garbage
collection before forking and restores it afterwards to avoid
http://bugs.python.org/issue1336.  That type of thing could also be
done in such a function.
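
To make the garbage-collection point concrete, here is a minimal sketch
of the disable-before-fork, restore-in-the-parent pattern. The names
fork_with_gc_paused and demo_gc_pause are made up for illustration; this
is not the subprocess module's actual code, only the general shape of
the idea.

```python
import gc
import os


def fork_with_gc_paused():
    """Fork with the cyclic GC switched off; re-enable it in the parent only.

    Hypothetical helper, not part of any existing API.
    """
    gc_was_enabled = gc.isenabled()
    gc.disable()
    try:
        pid = os.fork()
    except OSError:
        # fork() failed: put the collector back the way we found it.
        if gc_was_enabled:
            gc.enable()
        raise
    if pid != 0 and gc_was_enabled:
        # Parent: restore the collector. The child is left with GC off,
        # on the assumption that it will exec or exit shortly.
        gc.enable()
    return pid


def demo_gc_pause():
    """Return (child_saw_gc_off, parent_gc_restored)."""
    gc.enable()  # start from a known state for the demonstration
    pid = fork_with_gc_paused()
    if pid == 0:
        # Child: report through the exit status, then leave immediately.
        os._exit(0 if not gc.isenabled() else 1)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) == 0, gc.isenabled()
```

A shared before/after-fork hook could do this bookkeeping once instead
of every caller repeating it.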
Unfortunately it's rather hard to make those functions work correctly
with the current API -- we can't provide functions you can just use as
arguments to pthread_atfork because the global interpreter lock is not
re-entrant and we have no way of testing whether the current thread
holds the GIL. I also get the creepy-crawlies when I look at the various
thread_* implementations and see the horribly unsafe things they do (and
also, for instance, the PendingCall stuff in ceval.c :S). Unfortunately
there's no good way to fix these things without breaking API
compatibility, let alone ABI compatibility.
Take a look at http://code.google.com/p/python-atfork/ which I created
to address the general fork+threading with locks held causing
deadlocks issue with many standard library modules.  Currently it only
patches the logging module but I intend to support others.  I want to
get an atfork mechanism into 2.7/3.2 along with every lock in the
standard library making proper use of it.
See also http://bugs.python.org/issue6721
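
For anyone who has not run into the problem, a small sketch of the
hazard itself: a lock held by another thread at fork time is still
marked as held in the child, but the thread that would release it does
not exist there. The function name child_sees_lock_held is made up for
this illustration, and the child only does a non-blocking acquire so the
demonstration cannot hang.

```python
import os
import threading


def child_sees_lock_held():
    """Return True if a forked child finds the lock still held."""
    lock = threading.Lock()
    holding = threading.Event()
    release = threading.Event()

    def worker():
        with lock:
            holding.set()   # tell the main thread the lock is taken
            release.wait()  # keep holding it until told otherwise

    thread = threading.Thread(target=worker)
    thread.start()
    holding.wait()

    pid = os.fork()
    if pid == 0:
        # Child: only the forking thread survives. The worker is gone,
        # so a blocking acquire here would never return.
        got_it = lock.acquire(False)
        os._exit(0 if got_it else 1)

    # Parent: let the worker finish, then collect the child's verdict.
    release.set()
    thread.join()
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) == 1
```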
I make no attempt to deal with C-level locks, only those acquired from
Python.  It doesn't use pthread_atfork but instead models its behavior
after that and wraps os.fork and os.forkpty so that they call the
registered atfork methods as appropriate.
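
Roughly, and only as a sketch of the general technique rather than the
python-atfork project's real code: keep three handler lists in the style
of pthread_atfork and run them around os.fork(). The names
register_atfork, fork_with_handlers and demo_protected_lock are
hypothetical and exist only for this example.

```python
import os
import threading

_prepare_handlers = []  # run before fork, most recently registered first
_parent_handlers = []   # run after fork in the parent
_child_handlers = []    # run after fork in the child


def register_atfork(prepare=None, parent=None, child=None):
    """Register callables to run around fork_with_handlers()."""
    if prepare is not None:
        _prepare_handlers.insert(0, prepare)
    if parent is not None:
        _parent_handlers.append(parent)
    if child is not None:
        _child_handlers.append(child)


def fork_with_handlers():
    """Wrapper around os.fork() that calls the registered handlers."""
    for handler in _prepare_handlers:
        handler()
    try:
        pid = os.fork()
    except OSError:
        # fork() failed: undo the prepare step in this process.
        for handler in _parent_handlers:
            handler()
        raise
    for handler in (_child_handlers if pid == 0 else _parent_handlers):
        handler()
    return pid


def demo_protected_lock():
    """Return (child_could_acquire, parent_lock_is_free)."""
    lock = threading.Lock()
    # Hold the lock across the fork so no other thread can own it at
    # that moment, then release it on both sides.
    register_atfork(prepare=lock.acquire,
                    parent=lock.release,
                    child=lock.release)
    pid = fork_with_handlers()
    if pid == 0:
        os._exit(0 if lock.acquire(False) else 1)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) == 0, not lock.locked()
```

This only helps code that forks through the wrapper, which is exactly
why C-level callers of fork() are outside its reach.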
Well, sure, the *Python code* side of the problem is fixable, thanks to
Python being so awesome ;-P I'm strictly thinking of applications embedding
Python (or even extending it and calling into code that doesn't consider
Python.) Your python-atfork project looks like a terribly good idea, but it
won't fix the embedding/extending issues (nor is it intended to, right?)

-- 
Thomas Wouters 

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!