[Python-ideas] Membership of infinite iterators

Koos Zevenhoven k7hoven at gmail.com
Wed Oct 18 11:27:56 EDT 2017

On Wed, Oct 18, 2017 at 5:48 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:

> On 18 October 2017 at 22:36, Koos Zevenhoven <k7hoven at gmail.com> wrote:
>> On Wed, Oct 18, 2017 at 2:08 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
>>> That one can only be fixed in count() - list already checks
>>> operator.length_hint(), so implementing itertools.count.__length_hint__()
>>> to always raise an exception would be enough to handle the container
>>> constructor case.
>> While that may be a convenient hack to solve some of the cases, maybe
>> it's possible for list(..) etc. to give Ctrl-C a chance every now and then?
>> (Without a noticeable performance penalty, that is.) That would also help
>> with *finite* C-implemented iterables that are just slow to turn into a
>> list.
>> If I'm not mistaken, we're talking about C-implemented functions that
>> iterate over C-implemented iterators. It's not at all obvious to me that
>> it's the iterator that should handle Ctrl-C.
> It isn't, it's the loop's responsibility. The problem is that one of the
> core design assumptions in the CPython interpreter implementation is that
> signals from the operating system get handled by the opcode eval loop in
> the main thread, and Ctrl-C is one of those signals.
> This is why "for x in itertools.cycle(): pass" can be interrupted, while
> "sum(itertools.cycle())" can't: in the latter case, the opcode eval loop
> isn't running, as we're inside a tight loop inside the sum() implementation.
> It's easy to say "Well those loops should all be checking for signals
> then", but I expect folks wouldn't actually like the consequences of doing
> something about it, as:
> 1. It will make those loops slower, due to the extra overhead of checking
> for signals (even the opcode eval loop includes all sorts of tricks to
> avoid actually checking for new signals, since doing so is relatively slow)
> 2. It will make those loops harder to maintain, since the high cost of
> checking for signals means the existing flat loops will need to be replaced
> with nested ones to reduce the per-iteration cost of the more expensive
> checks

Combining points 1 and 2, I don't believe nesting loops is strictly a

> 3. It means making the signal checking even harder to reason about than it
> already is, since even C implemented methods that avoid invoking arbitrary
> Python code could now still end up checking for signals

So you're talking about code that would make a C-implemented Python
iterable of strictly C-implemented Python objects and then pass this to
something C-implemented like list(..) or sum(..), while expecting no Python
code to be run or signals to be checked anywhere while doing it. I'm not
really convinced that such code exists. But if such code does exist, it
sounds like the code is heavily dependent on implementation details.
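
For reference, the nested-loop shape from point 2 could look roughly like
this. It is only a Python-level sketch of the pattern, not how CPython's C
loops are written. The names chunked_sum, check and every are made up for
illustration, and check() stands in for whatever signal check the C code
would do (PyErr_CheckSignals() being the obvious candidate).

```python
import itertools


def chunked_sum(iterable, check, every=1024):
    """Sum `iterable`, calling `check()` once per `every` items.

    Hypothetical helper: `check` stands in for a signal check, so the
    expensive call is paid once per chunk instead of once per item.
    """
    it = iter(iterable)
    total = 0
    while True:
        consumed = 0
        # Inner loop: the tight part, no checks at all.
        for item in itertools.islice(it, every):
            total += item
            consumed += 1
        # Outer loop: one check per chunk.
        check()
        if consumed < every:
            # The iterator ran dry inside this chunk.
            return total


calls = []
result = chunked_sum(range(10), lambda: calls.append(1), every=4)
```

With every=4 and ten items, the check runs once per chunk (4, 4 and 2
items). If check() raises, the exception escapes at the next chunk
boundary, which would be the Ctrl-C case.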

> It's far from being clear to me that making such a change would actually
> be a net improvement, especially when there's an opportunity to mitigate
> the problem by having known-infinite iterators report themselves as such.

I'm not against that, per se. I just don't think that solves the quite
typical case of *finite* but very tedious or memory-consuming loops that
one might want to break out of. And raising an exception from
.__length_hint__() might also break some obscure, but completely valid,
operations on *infinite* iterables.
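
To make the mechanism concrete, here is a minimal sketch of what raising
from .__length_hint__() does. Forever is a hypothetical stand-in for
itertools.count, and the choice of OverflowError is my assumption; nothing
in this thread settles which exception would be used. The one thing I'm
fairly sure of is that operator.length_hint() swallows TypeError and
returns the default, so it would have to be some other exception type.

```python
import itertools
import operator


class Forever:
    """Hypothetical known-infinite iterator (stand-in for itertools.count)."""

    def __iter__(self):
        return self

    def __next__(self):
        return 0

    def __length_hint__(self):
        # Assumption: any non-TypeError exception; TypeError would be
        # swallowed by operator.length_hint() and replaced by the default.
        raise OverflowError("infinite iterator")


def outcome(func, *args):
    """Return the result of func(*args), or the exception message."""
    try:
        return func(*args)
    except OverflowError as exc:
        return str(exc)


# list() asks for the length hint before consuming anything, so this
# fails immediately instead of looping forever (CPython behaviour).
from_list = outcome(list, Forever())

# Anything that merely asks for a hint now raises as well.
from_hint = outcome(operator.length_hint, Forever())

# A finite slice is unaffected, since islice objects expose no hint.
first_three = list(itertools.islice(Forever(), 3))
```

The second call is the kind of thing I mean: code that only asks for a
hint, without ever intending to exhaust the iterator, would now get an
exception too.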


> Cheers,
> Nick.
> --
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

+ Koos Zevenhoven + http://twitter.com/k7hoven +