Add subprocess.Popen suspend() and resume()
Hello,
I've had these two implemented in psutil for a long time. On POSIX they are convenience functions using os.kill() + SIGSTOP / SIGCONT (the same as CTRL+Z / "fg"). On Windows they use the undocumented NtSuspendProcess and NtResumeProcess APIs, available since XP. The same approach is used by ProcessHacker and - I suppose - pssuspend.exe, the latter from the Sysinternals team. Note that there are 3 different ways to do this on Windows (https://stackoverflow.com/a/11010508/376587), but NtSuspend/ResumeProcess appears to be the best choice.
Possible use case:
<<PsSuspend lets you suspend processes on the local or a remote system, which is desirable in cases where a process is consuming a resource (e.g. network, CPU or disk) that you want to allow different processes to use. Rather than kill the process that's consuming the resource, suspending permits you to let it continue operation at some later point in time.>>
Thoughts?
--
Giampaolo - http://grodola.blogspot.com
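A minimal sketch of the POSIX half of the proposal, assuming the two helpers would do nothing more than wrap os.kill() as described above. The names suspend() and resume() as free functions are hypothetical; this is not psutil's actual code and not an existing subprocess API.

```python
import os
import signal
import subprocess


def suspend(proc: subprocess.Popen) -> None:
    """Stop the child with SIGSTOP (POSIX only; hypothetical helper)."""
    if proc.poll() is None:  # skip children that have already exited
        os.kill(proc.pid, signal.SIGSTOP)


def resume(proc: subprocess.Popen) -> None:
    """Continue a stopped child with SIGCONT (POSIX only; hypothetical helper)."""
    if proc.poll() is None:
        os.kill(proc.pid, signal.SIGCONT)
```

Unlike the Windows suspend count discussed later in the thread, SIGSTOP does not nest: a single SIGCONT continues the process no matter how many times it was stopped.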
Seems reasonable to me.
Regards
Antoine.
On 3/18/19, Giampaolo Rodola' <g.rodola@gmail.com> wrote:
I've been having these 2 implemented in psutil for a long time. On POSIX these are convenience functions using os.kill() + SIGSTOP / SIGCONT (the same as CTRL+Z / "fg"). On Windows they use undocumented NtSuspendProcess and NtResumeProcess Windows APIs available since XP.
Currently, Windows Python only calls documented C runtime-library and Windows API functions. It doesn't directly call NT runtime-library and system functions. Maybe it could in the case of documented functions, but calling undocumented functions in the standard library should be avoided. Unfortunately, without NtSuspendProcess and NtResumeProcess, I don't see a way to reliably implement this feature for Windows. I'm CC'ing Steve Dower. He might say it's okay in this case, or know of another approach.
DebugActiveProcess, the other simple approach mentioned in the linked SO answer [1], is unreliable and has the wrong semantics. A process only has a single debug port, so DebugActiveProcess will fail the PID as an invalid parameter if another debugger is already attached to the process. (The underlying NT call, DbgUiDebugActiveProcess, fails with STATUS_PORT_ALREADY_SET.) Additionally, the semantics that I expect here, at least for Windows, is that each call to suspend() will require a corresponding call to resume(), since it's incrementing the suspend count on the threads; however, a debugger can't reattach to the same process. Also, if the Python process exits while it's attached as a debugger, the system will terminate the debuggee as well, unless we call DebugSetProcessKillOnExit(0), but that interferes with the Python process acting as a debugger normally, as does this entire wonky idea. Also, the debugging system creates a thread in the debuggee that calls NT DbgUiRemoteBreakin, which executes a breakpoint. This thread is waiting, but it's not suspended, so the process will never actually appear as suspended in Task Manager or Process Explorer.
That leaves enumerating threads in a snapshot and calling OpenThread and SuspendThread on each thread that's associated with the process. In comparison, let's take an abridged look at the guts of NtSuspendProcess.
    nt!NtSuspendProcess:
        ...
        mov     r8,qword ptr [nt!PsProcessType]
        ...
        call    nt!ObpReferenceObjectByHandleWithTag
        ...
        call    nt!PsSuspendProcess
        ...
        mov     ebx,eax
        call    nt!ObfDereferenceObjectWithTag
        mov     eax,ebx
        ...
        ret
    nt!PsSuspendProcess:
        ...
        call    nt!ExAcquireRundownProtection
        cmp     al,1
        jne     nt!PsSuspendProcess+0x74
        ...
        call    nt!PsGetNextProcessThread
        xor     ebx,ebx
        jmp     nt!PsSuspendProcess+0x62
    nt!PsSuspendProcess+0x4d:
        ...
        call    nt!PsSuspendThread
        ...
        call    nt!PsGetNextProcessThread
    nt!PsSuspendProcess+0x62:
        ...
        test    rax,rax
        jne     nt!PsSuspendProcess+0x4d
        ...
        call    nt!ExReleaseRundownProtection
        jmp     nt!PsSuspendProcess+0x79
    nt!PsSuspendProcess+0x74:
        mov     ebx,0C000010Ah (STATUS_PROCESS_IS_TERMINATING)
    nt!PsSuspendProcess+0x79:
        ...
        mov     eax,ebx
        ...
        ret
This code repeatedly calls PsGetNextProcessThread to walk the non-terminated threads of the process in creation order (based on a linked list in the process object) and suspends each thread via PsSuspendThread. In contrast, a Tool-Help thread snapshot is unreliable since it won't include threads created after the snapshot is created. The alternative is to use a different undocumented system call, NtGetNextThread [2], which is implemented via PsGetNextProcessThread. But that's slightly worse than calling NtSuspendProcess.
[1]: https://stackoverflow.com/a/11010508
[2]: https://github.com/processhacker/processhacker/blob/v2.39/phnt/include/ntpsa...
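For illustration only, here is roughly what calling the undocumented pair from Python through ctypes could look like. This is a sketch under stated assumptions, not psutil's implementation and not something the thread proposes shipping: the helper names are hypothetical, the single-HANDLE signature returning NTSTATUS is the commonly published prototype rather than a documented one, and Popen._handle is a private CPython attribute that may change.

```python
import subprocess
import sys


def _nt_call(name: str, proc: subprocess.Popen) -> None:
    """Call ntdll!<name>(HANDLE) on a Popen's process handle (Windows only)."""
    if sys.platform != "win32":
        raise NotImplementedError(f"{name} exists only on Windows")
    import ctypes

    func = getattr(ctypes.WinDLL("ntdll"), name)
    func.argtypes = [ctypes.c_void_p]  # HANDLE ProcessHandle (assumed prototype)
    func.restype = ctypes.c_long       # NTSTATUS; negative values are errors
    # Assumption: Popen._handle is the private handle returned by CreateProcess.
    status = func(int(proc._handle))
    if status < 0:
        raise OSError(f"{name} failed with NTSTATUS {status & 0xFFFFFFFF:#010x}")


def nt_suspend(proc: subprocess.Popen) -> None:
    """Increment the suspend count of every thread in the child (hypothetical helper)."""
    _nt_call("NtSuspendProcess", proc)


def nt_resume(proc: subprocess.Popen) -> None:
    """Decrement the suspend count of every thread in the child (hypothetical helper)."""
    _nt_call("NtResumeProcess", proc)
```

Because the calls adjust per-thread suspend counts, each nt_suspend() would need a matching nt_resume() before the child runs again.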
I don't think this belongs in subprocess. It isn't related to process creation. A module on PyPI with the Windows code would make more sense.
On Wed, Mar 20, 2019 at 11:19 PM eryk sun <eryksun@gmail.com> wrote:
Currently, Windows Python only calls documented C runtime-library and Windows API functions. It doesn't directly call NT runtime-library and system functions. Maybe it could in the case of documented functions, but calling undocumented functions in the standard library should be avoided. Unfortunately, without NtSuspendProcess and NtResumeProcess, I don't see a way to reliably implement this feature for Windows. I'm CC'ing Steve Dower. He might say it's okay in this case, or know of another approach.
Thanks for chiming in with useful info as usual. I agree with your rationale after all. I've been dealing with undocumented Windows APIs in psutil for a long time, and they have always been in a sort of grey area: although they have stayed "stable" since forever, the lack of an official stance from Microsoft probably makes this addition inappropriate for the stdlib.
This code repeatedly calls PsGetNextProcessThread to walk the non-terminated threads of the process in creation order (based on a linked list in the process object) and suspends each thread via PsSuspendThread. In contrast, a Tool-Help thread snapshot is unreliable since it won't include threads created after the snapshot is created. The alternative is to use a different undocumented system call, NtGetNextThread [2], which is implemented via PsGetNextProcessThread. But that's slightly worse than calling NtSuspendProcess.
[1]: https://stackoverflow.com/a/11010508 [2]: https://github.com/processhacker/processhacker/blob/v2.39/phnt/include/ntpsa...
FWIW, older psutil versions relied on Thread32Next / OpenThread / SuspendThread / ResumeThread, which appear similar to these Ps* counterparts (and I assume have the same drawbacks).
--
Giampaolo - http://grodola.blogspot.com
On 3/24/19, Giampaolo Rodola' <g.rodola@gmail.com> wrote:
On Wed, Mar 20, 2019 at 11:19 PM eryk sun <eryksun@gmail.com> wrote:
This code repeatedly calls PsGetNextProcessThread to walk the non-terminated threads of the process in creation order (based on a linked list in the process object) and suspends each thread via PsSuspendThread. In contrast, a Tool-Help thread snapshot is unreliable since it won't include threads created after the snapshot is created. The alternative is to use a different undocumented system call, NtGetNextThread [2], which is implemented via PsGetNextProcessThread. But that's slightly worse than calling NtSuspendProcess.
[1]: https://stackoverflow.com/a/11010508
[2]: https://github.com/processhacker/processhacker/blob/v2.39/phnt/include/ntpsapi.h#L848
FWIW older psutil versions relied on Thread32Next / OpenThread / SuspendThread / ResumeThread, which appear similar to these Ps* counterparts (and I assume have the same drawbacks).
This is the toolhelp snapshot I was talking about, which is an unreliable way to pause a process since it doesn't include threads created after the snapshot. For TH32CS_SNAPTHREAD, it's based on calling NtQuerySystemInformation: SystemProcessInformation to take a snapshot of all running processes and threads at the time. This buffer gets written to a shared section, and the section handle is returned as the snapshot handle. Thread32First and Thread32Next are called to walk the buffer a record at a time by temporarily mapping the section with NtMapViewOfSection and NtUnmapViewOfSection.
In contrast, NtSuspendProcess is based on PsGetNextProcessThread, which walks a linked list of the non-terminated threads in the process. Unlike a snapshot, this won't miss threads created after we start, since new threads are appended to the list. To implement this in user mode with SuspendThread would require the NtGetNextThread system call that's implemented via PsGetNextProcessThread. But that's just trading one undocumented system call for another at the expense of a more complicated implementation.
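The difference between the two walks can be shown with a toy model. This is plain Python over a list of made-up thread IDs and calls no Windows API; every name in it is hypothetical. The callback stands in for the still-running process creating a thread while the walk is in progress.

```python
def walk_snapshot(threads, on_first_visit):
    """Tool-Help style: iterate over a copy taken up front."""
    suspended = []
    for index, tid in enumerate(list(threads)):  # frozen copy of the list
        if index == 0:
            on_first_visit(threads)  # the process spawns a thread mid-walk
        suspended.append(tid)
    return suspended


def walk_live_list(threads, on_first_visit):
    """PsGetNextProcessThread style: keep asking the live list for the next entry."""
    suspended = []
    index = 0
    while index < len(threads):
        if index == 0:
            on_first_visit(threads)
        suspended.append(threads[index])
        index += 1
    return suspended


def spawn_thread(threads):
    """Append a new thread ID, as a running process might."""
    threads.append(max(threads) + 1)


print(walk_snapshot([1, 2, 3], spawn_thread))   # [1, 2, 3]
print(walk_live_list([1, 2, 3], spawn_thread))  # [1, 2, 3, 4]
```

The snapshot walk never sees thread 4, so that thread would keep running; the live walk reaches it because new entries are appended to the list being traversed.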
participants (4)
- Antoine Pitrou
- eryk sun
- Giampaolo Rodola'
- Gregory P. Smith