[issue6721] Locks in python standard library should be sanitized on fork
report at bugs.python.org
Mon Aug 29 21:04:01 CEST 2011
sbt <shibturn at gmail.com> added the comment:
multiprocessing.util already has register_after_fork() which it uses for cleaning up certain things when a new process (launched by multiprocessing) is starting. This is very similar to the proposed atfork mechanism.
Multiprocessing assumes that it is always safe to delete lock objects. If reinit_locks.diff is committed then I guess this won't be a problem.
I will try to go through multiprocessing's use of threads:
Queues have a feeder thread which pushes objects into the underlying pipe as soon as possible. The state which can be modified by this thread is a threading.Condition object and a collections.deque buffer. Both of these are replaced by fresh copies by the after-fork mechanism.
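To illustrate the pattern described above, here is a minimal sketch of an object that rebuilds its thread-shared state through register_after_fork(). The class name BufferedChannel and its attribute names are hypothetical; only multiprocessing.util.register_after_fork(obj, func) is the real API. This is not the actual Queue implementation.

```python
import collections
import threading
from multiprocessing import util


class BufferedChannel:
    """Hypothetical stand-in for an object with a feeder thread."""

    def __init__(self):
        # Build the initial state with the same routine used after a fork.
        self._after_fork()
        # Ask multiprocessing to call BufferedChannel._after_fork(self)
        # in child processes that it launches.
        util.register_after_fork(self, BufferedChannel._after_fork)

    def _after_fork(self):
        # Fresh copies: nothing held or half-modified by a thread of the
        # parent process survives into the child.
        self._notempty = threading.Condition(threading.Lock())
        self._buffer = collections.deque()
```

The hook only runs in children started by multiprocessing itself, so this sketch does not cover a bare os.fork().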
However, because objects in the buffer may have __del__ methods or weakref callbacks associated, arbitrary code may be run by the background thread if an object's reference count falls to zero while that thread holds the last reference.
Simply pickling the argument of put() before adding it to the buffer fixes that problem -- see the patch for Issue 10886. With this patch I think Queue's use of threads is fork-safe.
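A rough sketch of the idea, assuming a simplified buffer rather than the real Queue code or the patch itself: put() serializes in the caller's thread, so the buffer only ever holds bytes and dropping an entry in the feeder thread cannot trigger __del__ or weakref callbacks. The class EagerPickleBuffer and its method names are hypothetical, and plain pickle is used here in place of multiprocessing's own pickler.

```python
import collections
import pickle


class EagerPickleBuffer:
    """Hypothetical buffer that stores pickled bytes, never live objects."""

    def __init__(self):
        self._buffer = collections.deque()

    def put(self, obj):
        # Pickle immediately, in the thread that called put().
        self._buffer.append(pickle.dumps(obj))

    def pop_bytes(self):
        # What a feeder thread would take and write to the pipe.
        return self._buffer.popleft()
```

A side benefit of this design is that pickling errors surface in the caller of put() instead of in the background thread.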
If a fork occurs while a pool is running then a forked process will get a copy of the pool object in an inconsistent state -- but that does not matter since trying to use a pool from a forked process will *never* work.
Also, some of a pool's methods support callbacks which can execute arbitrary code in a background thread. This can create inconsistent state in a forked process.
As with Queue.put, pool methods should pickle immediately for similar reasons.
I would suggest documenting clearly that a pool should only ever be used or deleted by the process which created it. We can use register_after_fork to make all of a pool's methods raise an error after a fork.
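One way this could look, as a sketch only: the object registers a hook that marks it as unusable in the child, and every public method checks that mark. The class OwnedResource, its submit() method, and the error message are hypothetical and are not part of multiprocessing.Pool; only register_after_fork() is the real function.

```python
from multiprocessing import util


class OwnedResource:
    """Hypothetical pool-like object usable only by its creating process."""

    def __init__(self):
        self._forked = False
        # In a child launched by multiprocessing, flag this copy as invalid.
        util.register_after_fork(self, OwnedResource._mark_forked)

    def _mark_forked(self):
        self._forked = True

    def _check_owner(self):
        if self._forked:
            raise RuntimeError(
                "this object can only be used by the process that created it")

    def submit(self, value):
        # Every public method would start with the same ownership check.
        self._check_owner()
        return value
```

Since the hook is only run for processes started by multiprocessing, a child created with a bare os.fork() would not be flagged by this sketch.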
We should also document that callbacks should only be used if no more processes will be forked.
Currently multiprocessing.allow_connection_pickling() does not work because types are registered with ForkingPickler instead of copyreg -- see Issue 4892. However, the code in multiprocessing.reduction uses a background thread to support the transfer of sockets/connections between processes.
If this code is ever resurrected I think the use of register_after_fork makes this safe.
A manager uses a threaded server process. This is not a problem unless you create a user-defined manager which forks new processes. The documentation should just say "Don't Do That".
I think multiprocessing's threading issues are all fixable.
Python tracker <report at bugs.python.org>