Interrupt thread.join() with Ctrl-C / KeyboardInterrupt on Windows
Hi all,

Is it possible that thread.join() cannot be interrupted on Windows, while it can be on Linux? Would this be a bug, or is it by design?

    import threading, time

    def wait():
        time.sleep(1000)

    t = threading.Thread(target=wait)
    t.start()
    t.join()  # Press Control-C now. It stops on Linux, while it hangs on Windows.

Tested on Python 3.6.

Thanks,
Jonathan
On Tue, Aug 8, 2017 at 2:54 AM, Jonathan Slenders <jonathan@slenders.be> wrote:
> Hi all,
>
> Is it possible that thread.join() cannot be interrupted on Windows, while it can be on Linux? Would this be a bug, or is it by design?
>
>     import threading, time
>
>     def wait():
>         time.sleep(1000)
>
>     t = threading.Thread(target=wait)
>     t.start()
>     t.join()  # Press Control-C now. It stops on Linux, while it hangs on Windows.
This comes down to a difference in how the Linux and Windows low-level APIs handle control-C and blocking functions: on Linux, the default is that any low-level blocking function can be interrupted by a control-C or any other random signal, and it's the calling code's job to check for this and restart it if necessary. This is annoying because it means that every low-level function call inside the Python interpreter has to have a bunch of boilerplate to detect this and retry, but OTOH it means that control-C automatically works in (almost) all cases. On Windows, they made the opposite decision: low-level blocking functions are never automatically interrupted by control-C. It's a reasonable design choice. The advantage is that sloppily written programs tend to work better -- on Linux you kind of *have* to put a retry loop around *every* low level call or your program will suffer weird random bugs, and on Windows you don't.

But for carefully written programs like CPython this is actually pretty annoying, because if you *do* want to wake up on a control-C, then on Windows that has to be laboriously implemented on a case-by-case basis for each blocking function, and often this requires some kind of cleverness or is actually impossible, depending on what function you want to interrupt. At least on Linux the retry loop is always the same.

The end result is that on Windows, control-C almost never works to wake up a blocked Python process, with a few special exceptions where someone did the work to implement this. On Python 2 the only functions that have this implemented are time.sleep() and multiprocessing.Semaphore.acquire; on Python 3 there are a few more (you can grep the source for _PyOS_SigintEvent to find them), but Thread.join isn't one of them.
It looks like Thread.join ultimately ends up blocking in Python/thread_nt.h:EnterNonRecursiveMutex, which has a maze of #ifdefs behind it -- I think there are 3 different implementations you might end up with, depending on how CPython was built? Two of them seem to ultimately block in WaitForSingleObject, which would be easy to adapt to handle control-C. Unfortunately I think the implementation that actually gets used on modern systems is the one that blocks in SleepConditionVariableSRW, and I don't see any easy way for a control-C to interrupt that. But maybe I'm missing something -- I'm not a Windows expert.

-n
-- Nathaniel J. Smith -- https://vorpus.org
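
To illustrate the Linux-side retry loop described in the message above, here is a minimal Python sketch. The names retry_on_eintr and make_flaky_read are hypothetical, and the flaky function only simulates a blocking call that is interrupted a few times; it does not involve real signals. CPython implements this pattern in C, and since Python 3.5 (PEP 475) the interpreter retries interrupted system calls itself, so ordinary Python code rarely needs such a loop.

```python
import errno


def retry_on_eintr(call, *args):
    """Call `call(*args)`, restarting it whenever it reports EINTR."""
    while True:
        try:
            return call(*args)
        except InterruptedError:
            # A signal arrived while the call was blocked. If the signal
            # handler had raised (e.g. KeyboardInterrupt), that exception
            # would propagate instead; otherwise we simply restart the call.
            continue


def make_flaky_read(interruptions):
    """Hypothetical stand-in for a blocking call interrupted N times."""
    remaining = [interruptions]

    def flaky_read():
        if remaining[0] > 0:
            remaining[0] -= 1
            raise InterruptedError(errno.EINTR, "Interrupted system call")
        return b"data"

    return flaky_read


result = retry_on_eintr(make_flaky_read(3))
print(result)
```

The loop has the same shape whichever call it wraps, which is the point made above: on Linux the boilerplate is tedious but uniform, whereas on Windows each blocking function needs its own wake-up mechanism.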
Thank you Nathaniel for the response! Really interesting and helpful.
On 08Aug2017 1151, Nathaniel Smith wrote:
> It looks like Thread.join ultimately ends up blocking in Python/thread_nt.h:EnterNonRecursiveMutex, which has a maze of #ifdefs behind it -- I think there are 3 different implementations you might end up with, depending on how CPython was built? Two of them seem to ultimately block in WaitForSingleObject, which would be easy to adapt to handle control-C. Unfortunately I think the implementation that actually gets used on modern systems is the one that blocks in SleepConditionVariableSRW, and I don't see any easy way for a control-C to interrupt that. But maybe I'm missing something -- I'm not a Windows expert.
I'd have to dig back through the recent attempts at changing this, but I believe the SleepConditionVariableSRW path is unused for all versions of Windows.

A couple of people (including myself) attempted to enable that code path, but it has some subtle issues that were causing test failures, so we abandoned all the attempts. Though ISTR that someone put in more effort than most of us, but I don't think we've merged it (and if we have, it'd only be in 3.7 at this stage).

Cheers,
Steve
On Tue, Aug 8, 2017 at 2:29 PM, Steve Dower <steve.dower@python.org> wrote:
> I'd have to dig back through the recent attempts at changing this, but I believe the SleepConditionVariableSRW path is unused for all versions of Windows.
>
> A couple of people (including myself) attempted to enable that code path, but it has some subtle issues that were causing test failures, so we abandoned all the attempts. Though ISTR that someone put in more effort than most of us, but I don't think we've merged it (and if we have, it'd only be in 3.7 at this stage).
Ah, you're right -- the comments say it's used on Windows 7 and later, but the code disagrees. Silly me for trusting the comments :-).

So it looks like it would actually be fairly straightforward to add control-C interruption support.

-n
-- Nathaniel J. Smith -- https://vorpus.org
On 08Aug2017 1512, Nathaniel Smith wrote:
> Ah, you're right -- the comments say it's used on Windows 7 and later, but the code disagrees. Silly me for trusting the comments :-).
>
> So it looks like it would actually be fairly straightforward to add control-C interruption support.
Except we're still hypothesising that the native condition variables will be faster than our emulation. I think until we prove or disprove that with a correct implementation, I'd rather not make a promise that Ctrl+C will work in situations where we depend on it.

That's not to say that it isn't possible to continue fixing Ctrl+C handling in targeted locations. But I don't want to guarantee that an exception case like that will always work given there's a chance it may prevent us getting a performance benefit in the normal case. (I'm trying to advise caution, rather than saying it'll never happen.)

Cheers,
Steve
On Tue, 8 Aug 2017 14:29:41 -0700 Steve Dower <steve.dower@python.org> wrote:
> I'd have to dig back through the recent attempts at changing this, but I believe the SleepConditionVariableSRW path is unused for all versions of Windows.
>
> A couple of people (including myself) attempted to enable that code path, but it has some subtle issues that were causing test failures, so we abandoned all the attempts. Though ISTR that someone put in more effort than most of us, but I don't think we've merged it (and if we have, it'd only be in 3.7 at this stage).
For the record, there are issues open for this:

- locks not interruptible on Windows: https://bugs.python.org/issue29971
- enable optimized locks on Windows: https://bugs.python.org/issue29871

Having Lock.acquire() be interruptible would be really nice as it's the basis for so many of our synchronization primitives (including Thread.join(), I believe).

Regards

Antoine.
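
Until the lock wait itself becomes interruptible, a commonly used workaround is to join with a short timeout in a loop, so that the main thread regularly returns to the interpreter and a pending Ctrl-C can be turned into KeyboardInterrupt. The sketch below is not something proposed in this thread; the names join_interruptibly and poll_interval are hypothetical, and it assumes that the signal is delivered between timed waits.

```python
import threading
import time


def join_interruptibly(thread, poll_interval=0.1):
    """Join `thread`, waking up periodically so Ctrl-C can be delivered."""
    while thread.is_alive():
        # A timed join returns after at most `poll_interval` seconds, which
        # gives the interpreter a chance to run the SIGINT handler.
        thread.join(timeout=poll_interval)


def worker(duration):
    time.sleep(duration)


# daemon=True so that an interrupted program is not kept alive at exit
# by a worker that is still sleeping.
t = threading.Thread(target=worker, args=(0.3,), daemon=True)
t.start()
try:
    join_interruptibly(t)
except KeyboardInterrupt:
    print("interrupted while waiting for the worker")
finished = not t.is_alive()
```

The cost is a small delay of up to poll_interval before the interrupt is noticed, plus periodic wake-ups of the main thread, which is the kind of trade-off a proper fix in Lock.acquire() would avoid.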
Participants (4): Antoine Pitrou, Jonathan Slenders, Nathaniel Smith, Steve Dower