On 19 October 2017 at 08:34, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Nick Coghlan wrote:
since breaking up the current single level loops as nested loops would be a pre-requisite for allowing these APIs to check for signals while they're running while keeping the per-iteration overhead low

Is there really much overhead? Isn't it just checking a flag?

It's checking an atomically updated flag, so it forces CPU cache synchronisation, which means you don't want to be doing it on every iteration of a low level loop.

However, reviewing Serhiy's PR reminded me that PyErr_CheckSignals() already encapsulates the "Should this thread even be checking for signals in the first place?" logic, which means the code change to make the itertools iterators inherently interruptible with Ctrl-C is much smaller than I thought it would be. That approach is also clearly safe from an exception handling point of view, since all consumer loops already need to cope with the fact that itr.__next__() may raise arbitrary exceptions (including KeyboardInterrupt).

So that change alone already offers a notable improvement, and combining it with a __length_hint__() implementation that keeps container constructors from even starting to iterate would go even further towards making the infinite iterators more user friendly.

Similar signal checking changes to the consumer loops would also be possible, but I don't think that's an either/or decision: changing the iterators means they'll be interruptible for any consumer, while changing the consumers would make them interruptible for any iterator, and having checks in both the producer & the consumer merely means that you'll be checking for signals twice every 65k iterations, rather than once.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan@gmail.com   |   Brisbane, Australia