[issue9205] Parent process hanging in multiprocessing if children terminate unexpectedly
report at bugs.python.org
Fri Aug 27 19:59:47 CEST 2010
Greg Brockman <gdb at mit.edu> added the comment:
Hmm, a few notes. I have a bunch of nitpicks, but those can wait for a later iteration. (Just one style nit: I noticed a few unneeded whitespace changes... please try not to do that, as it makes the patch harder to read.)
- Am I correct that you handle a crashed worker by aborting all running jobs? If so:
  - Is this acceptable for your use case? I'm fine with it, but had been under the impression that we would rather this did not happen.
  - If you're going to the effort of ACKing, why not record the mapping of tasks to workers so you can be more selective in your termination? Otherwise, what does the ACKing do towards fixing this particular issue?
- I think in the final version you'd need to introduce some interthread locking, because otherwise you're going to have weird race conditions. I haven't thought too hard about whether you can get away with just catching unexpected exceptions, but it's probably better to do the locking.
- I'm getting hangs infrequently enough to make debugging annoying, and I don't have time to track down the bug right now. Why don't you strip out any changes that are not needed (e.g. AFAICT, the ACK logic), make sure there aren't weird race conditions, and if we start converging on a patch that looks right from a high level we can try to make it work on all the corner cases?
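To make the task-to-worker mapping and the locking points concrete, here is a rough sketch of the kind of bookkeeping I have in mind. It is not taken from the patch: `TaskTracker` and its method names are made up for illustration, and it assumes that ACKs, results, and worker-death notifications are handled on different threads, so a single `threading.Lock` guards the shared dict.

```python
import threading


class TaskTracker:
    """Hypothetical bookkeeping: which worker PID has ACKed which job."""

    def __init__(self):
        self._lock = threading.Lock()  # guards _owner across handler threads
        self._owner = {}               # job id -> worker pid

    def ack(self, job_id, worker_pid):
        # Called when a worker acknowledges that it has picked up a job.
        with self._lock:
            self._owner[job_id] = worker_pid

    def done(self, job_id):
        # Called when a result arrives; the job is no longer at risk.
        with self._lock:
            self._owner.pop(job_id, None)

    def jobs_lost_with(self, worker_pid):
        # Called when a worker is found dead: remove and return only the
        # jobs that worker had ACKed, leaving every other job untouched.
        with self._lock:
            lost = [j for j, pid in self._owner.items() if pid == worker_pid]
            for j in lost:
                del self._owner[j]
        return lost


# Usage: two workers (made-up PIDs 100 and 200), worker 100 dies mid-job.
tracker = TaskTracker()
tracker.ack(1, 100)
tracker.ack(2, 200)
tracker.ack(3, 100)
tracker.done(3)                      # job 3 finished before the crash
lost = tracker.jobs_lost_with(100)   # only job 1 needs to be failed
```

With something along these lines, a crash would fail only the jobs the dead worker had acknowledged rather than every running job. Whether that is worth the extra complexity is exactly the question above.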
Python tracker <report at bugs.python.org>