[issue6721] Locks in python standard library should be sanitized on fork

Wed May 4 22:58:57 CEST 2011

Charles-François Natali <neologix at free.fr> added the comment:

>> - what's current_thread_id ? If it's thread_get_ident (pthread_self),
>> since TID is not guaranteed to be inherited across fork, this won't
>> work
>
> Ouch, then the approach I'm proposing is probably doomed.
>

Well, it works on Linux with NPTL, but I'm not sure at all it holds
for other implementations (pthread_t it's only meaningful within the
same process).
But I'm not sure it's really the "killer" point: PID with linuxthreads
and lock being acquired by a second thread before the main thread
releases it in the child process also look like serious problems.

> Well, this means indeed that *some* locks can be handled, but not all of
> them and not automatically, right?
> Also, how would you propose they be dealt with in practice? How do they
> get registered, and how does the reinitialization happen?

When a lock object is allocated in Modules/threadmodule.c
(PyThread_allocate_lock/newlockobject), add the underlying lock
(self->lock_lock) to a linked list (since it's called with the GIL
held, we don't need to protect the linked list from concurrent
access). Each thread implementation (thread_pthread.h, thread_nt.h)
would provide a new PyThread_reinit_lock function that would do the
right thing (pthread_mutex_destroy/init, sem_destroy/init, etc).
Modules/threadmodule.c would provide a new PyThread_ReInitLocks that
would walk through the linked list and call PyThread_reinit_lock for
each lock.
PyOS_AfterFork would call this PyThread_ReInitLocks right after fork.
This would have the advantage of being consistent with what's already
done to reinit the TLS key and the import lock. So, we guarantee to be
in a consistent and usable state when PyOS_AfterFork returns. Also,
it's somewhat simpler because we're sure that at that point only one
thread is running (once again, no need to protect the linked-list
walk).
I don't think that the performance impact would be noticable (I know
it's O(N) where N is the number of locks), and contrarily to the
automatic approach, this wouldn't penalize every acquire/release.
Of course, this would solve the problem of threading's module locks,
so PyEval_ReInitThreads could be removed, along with threading.py's
_after_fork and _reset_internal_locks.
In short, this would reset every lock held so that they're usable in
the child process, even locks allocated e.g. from
Modules/_io/bufferedio.c.
But this wouldn't allow a lock's state to be inherited across fork for
the main thread (but like I said, I don't think that this makes much
sense anyway, and to my knowledge no implementation makes such a
guarantee - and definitely not POSIX).

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue6721>
_______________________________________