Multiprocessing, join(), and crashed processes
Cameron Simpson
cs at cskk.id.au
Thu Feb 6 00:53:23 EST 2020
On 05Feb2020 15:48, Israel Brewster <ijbrewster at alaska.edu> wrote:
>In a number of places I have constructs where I launch several
>processes using the multiprocessing library, then loop through said
>processes calling join() on each one to wait until they are all
>complete. In general, this works well, with the *apparent* exception
>of a child process crashing (not raising an exception, but actually
>crashing). In that event, it appears that the call to
>join() hangs indefinitely. How can I best handle this? Should I put a
>timeout on the join, and put it in a loop, such that every 5 seconds or
>so it breaks, checks to see if the process is still actually running,
>and if so goes back and calls join again? Or is there a better option
>to say “wait until this process is done, however long that may be,
>unless it crashes”?
What's your platform/OS? And what does "crash" mean, precisely?
If a subprocess exits, join() should return, even if the child was 
killed by a signal. In that case Process.exitcode tells you how it 
exited: a negative value is the negated signal number (e.g. -9 for 
SIGKILL).
If the subprocess _hangs_, then join will not see it exit, because it
hasn't. And join will hang.
You'll need to define what happens when your subprocesses crash.
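If you do want your suggested timeout-and-check loop, it might look 
like this (a hedged sketch; the wait_for name and the 5 second poll 
interval are my own choices, and "crashed" here just means "exited 
with a nonzero or negative exitcode"):

```python
import multiprocessing as mp

def wait_for(proc, poll=5.0):
    """Join proc, waking every `poll` seconds; raise if it exited abnormally.

    join() with a timeout returns after the timeout whether or not the
    process has finished, so we loop until is_alive() goes false.
    """
    while proc.is_alive():
        proc.join(poll)      # wake up periodically; could log progress here
    proc.join()              # reap the now-finished process
    if proc.exitcode != 0:
        raise RuntimeError(
            "process %d exited with code %s" % (proc.pid, proc.exitcode))
```

The loop body is also a natural place to hook in whatever "what 
happens when it crashes" policy you settle on.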
Cheers,
Cameron Simpson <cs at cskk.id.au>