[New-bugs-announce] [issue42558] waitpid/waitid race caused by change to Popen.send_signal in Python 3.9

Jack O'Connor report at bugs.python.org
Thu Dec 3 12:05:29 EST 2020


New submission from Jack O'Connor <oconnor663 at gmail.com>:

In Python 3.9, Popen.send_signal() was changed to call Popen.poll() internally before signaling. (Tracking bug: https://bugs.python.org/issue38630.) This is a best-effort mitigation for the well-known kill/wait race condition, where a signal is sent to a PID whose process has already been reaped. However, because poll() can now reap an already-exited child process as a side effect of signaling, it can cause previously working programs to crash. Here's a simple example:

```
import os
import subprocess
import time

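child = subprocess.Popen(["true"])
# Give the child time to exit, so it's a zombie by the time we signal it.
time.sleep(1)
# In 3.9 this calls poll() first, which reaps the exited child.
child.kill()
# Nothing is left to wait for, so in 3.9 this raises ChildProcessError.
os.waitpid(child.pid, 0)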
```

The program above exits cleanly in Python 3.8 but crashes with ChildProcessError in Python 3.9. It's a race against child process exit, so in practice (without the sleep) it's a heisenbug.
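
The reaping side effect can also be observed without relying on a sleep. The sketch below is a simplified paraphrase of the poll-then-signal behavior described above, not the actual CPython source; `send_signal_with_poll` is a hypothetical name. It assumes a Unix platform where os.waitid is available, and uses WNOWAIT to block until the child has exited without reaping it, so the outcome does not depend on timing.

```python
import os
import signal
import subprocess
import sys


def send_signal_with_poll(proc, sig):
    """Hypothetical stand-in for the poll-then-signal behavior described above."""
    proc.poll()  # reaps the child if it has already exited
    if proc.returncode is not None:
        return  # already exited and reaped; nothing left to signal
    os.kill(proc.pid, sig)


child = subprocess.Popen([sys.executable, "-c", "pass"])

# Block until the child has exited, but leave it unreaped (a zombie).
os.waitid(os.P_PID, child.pid, os.WEXITED | os.WNOWAIT)

send_signal_with_poll(child, signal.SIGKILL)
reaped_by_poll = child.returncode is not None

# The caller's own wait now fails, because the child is already gone.
try:
    os.waitpid(child.pid, 0)
    waitpid_failed = False
except ChildProcessError:
    waitpid_failed = True
```

Here both `reaped_by_poll` and `waitpid_failed` end up true: the signaling call consumed the child's exit status, so the explicit waitpid has nothing left to collect.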

There's a deeper race here that's harder to demonstrate in an example, but parallel to the original wait/kill issue: If the child has already been reaped and its PID happens to be reused for a new child started by another thread in this same process, the call to waitpid might *succeed* and reap that unrelated child process. That would export the crash to the other thread, and possibly create more kill/wait races.

In short, the question of when exactly a child process is reaped is important for correct signaling on Unix, and changing that behavior can break programs in confusing ways. This change affected the Duct library, and I might not've caught it if not for a lucky failing doctest: https://github.com/oconnor663/duct.py/commit/5dfae70cc9481051c5e53da0c48d9efa8ff71507

I haven't searched for more instances of this bug in the wild, but one way to find them would be to look for code that calls os.waitpid/waitid and also calls Popen.send_signal/kill/terminate. Duct found itself in this position because it was using waitid(WNOWAIT) on Unix only, to solve this same race condition, and also using Popen.kill on both Unix and Windows.
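
For readers unfamiliar with the waitid(WNOWAIT) approach, here is a minimal sketch of the general idea. It is not Duct's actual code. The `NoWaitChild` class and its method names are hypothetical, it is Unix-only, and it assumes a single waiter thread. It also signals with os.kill rather than Popen.kill, which is an assumption about one way to keep signaling from reaping the child as a side effect.

```python
import os
import signal
import subprocess
import sys
import threading


class NoWaitChild:
    """Hypothetical wrapper: reaping and signaling are serialized by one lock."""

    def __init__(self, args):
        self._proc = subprocess.Popen(args)
        self._lock = threading.Lock()

    def wait(self):
        # Block until the child exits, but leave it unreaped, so its PID
        # cannot be reused while another thread may still signal it.
        os.waitid(os.P_PID, self._proc.pid, os.WEXITED | os.WNOWAIT)
        with self._lock:
            # The child is already a zombie, so this returns without blocking.
            return self._proc.wait()

    def kill(self):
        with self._lock:
            # While we hold the lock, the child is either already reaped
            # (returncode is set) or its PID is still reserved for it.
            if self._proc.returncode is None:
                # os.kill only signals; it never reaps the child.
                os.kill(self._proc.pid, signal.SIGKILL)


child = NoWaitChild([sys.executable, "-c", "import time; time.sleep(60)"])
child.kill()
status = child.wait()  # negative signal number, as reported by Popen
child.kill()  # no-op: the child has already been reaped
```

The important property is that the only call that reaps the child runs under the same lock as the signal, so a signal is never sent to a PID that this process has already released.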

----------
messages: 382429
nosy: oconnor663
priority: normal
severity: normal
status: open
title: waitpid/waitid race caused by change to Popen.send_signal in Python 3.9
type: crash
versions: Python 3.9

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue42558>
_______________________________________

