<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On 18 October 2017 at 03:39, Koos Zevenhoven <span dir="ltr"><<a href="mailto:k7hoven@gmail.com" target="_blank">k7hoven@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><span class="gmail-"><div style="font-family:monospace,monospace"><span style="font-family:arial,sans-serif">On Tue, Oct 17, 2017 at 5:26 PM, Serhiy Storchaka </span><span dir="ltr" style="font-family:arial,sans-serif"><<a href="mailto:storchaka@gmail.com" target="_blank">storchaka@gmail.com</a>></span><span style="font-family:arial,sans-serif"> wrote:</span><br></div></span><div class="gmail_extra"><div class="gmail_quote"><span class="gmail-"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">17.10.17 17:06, Nick Coghlan writes:<span><br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
Keep in mind we're not talking about a regular loop you can break out of with Ctrl-C here - we're talking about a tight loop inside the interpreter internals that leads to having to kill the whole host process just to get out of it.<br>
</blockquote>
<br></span>
And this is the root of the issue. Just let more tight loops be interruptible with Ctrl-C, and this will fix the more general issue.<div class="gmail-m_-5007911347939893422HOEnZb"><div class="gmail-m_-5007911347939893422h5"><br></div></div></blockquote><div><br></div></span><div style="font-family:monospace,monospace">Not being able to interrupt something with Ctrl-C in the repl or with the interrupt command in Jupyter notebooks is definitely a thing I sometimes encounter. A pity I don't remember when it happens, because I usually forget it very soon after I've restarted the kernel and continued working. But my guess is it's usually not because of an infinite iterator.</div></div></div></div></blockquote><div><br></div><div>Fixing the general case is hard, because the assumption that signals are only checked between interpreter opcodes is a pervasive one throughout the interpreter internals. We certainly *could* redefine affected C APIs as potentially raising KeyboardInterrupt (adjusting the signal management infrastructure accordingly), and if someone actually follows through and implements that some day, then the argument could be made that, given such a change, it might be reasonable to drop any a priori guards that we have put in place for particular *detectable* uninterruptible infinite loops.</div><div><br></div><div>However, that's not the design question being discussed in this thread. The design question here is "We have 3 known uninterruptible infinite loops that are easy to detect and prevent. Should we detect and prevent them?". "We shouldn't allow anyone to do this easy thing, because it would be preferable for someone to instead do this hard and complicated thing that nobody is offering to do" isn't a valid design argument in that situation.<br></div><div><br></div><div>And I have a four-step check for that which prompts me to say "Yes, we should detect and prevent them":</div><div><br></div><div>1. 
Uninterruptible loops are bad, so having fewer of them is better</div><div>2. These particular cases can be addressed locally using existing protocols, so the chances of negative side effects are low<br></div><div>3. The total amount of code involved is likely to be small (a dozen or so lines of C, a similar number of lines of Python in the tests) in well-isolated protocol functions, so the chances of introducing future maintainability problems are low<br></div><div>4. We have a potential contributor who is presumably offering to do the work (if that's not the case, then the question is moot anyway until a sufficiently interested volunteer turns up)<br></div><br><div><div>As an alternative implementation approach, the case could also be made that these iterators
should be raising TypeError in __length_hint__, as that protocol method is explicitly
designed to be used for finite container pre-allocation. That way things like
"list(itertools.count())" would fail immediately (similar to the way
"list(range(10**100))" already does) rather than attempting to consume all
available memory before (hopefully) finally failing with MemoryError.</div></div><div><br></div><div>If we were to do that, then we *could* make the solution to the reported problem more general by having all builtin and standard library operations that expect to be working with finite iterators (the containment testing fallback, min, max, sum, any, all, functools.reduce, etc) check for a length hint, even if they aren't actually pre-allocating any memory. Then the general purpose marker for "infinite iterator" would be "Explicitly defines __length_hint__ to raise TypeError", and it would prevent a priori all operations that attempted to fully consume the iterator.</div><div><br></div><div>That more general approach would cause some currently "working" code (like "any(itertools.count())" and "all(itertools.count())", both of which consume at most 2 items from the iterator) to raise an exception instead, and hence would require the introduction of a DeprecationWarning in 3.7 (where the affected APIs would start calling length hint, but suppress any exceptions from it), before allowing the exception to propagate in 3.8+.<br></div><div><br></div><div>Cheers,</div><div>Nick.</div><div></div><br></div>-- <br><div class="gmail_signature">Nick Coghlan | <a href="mailto:ncoghlan@gmail.com" target="_blank">ncoghlan@gmail.com</a> | Brisbane, Australia</div>
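[Editor's illustration] The __length_hint__ idea above can be sketched in pure Python (the class name InfiniteCount below is hypothetical; the real itertools.count is implemented in C). One wrinkle worth noting: under current PEP 424 semantics, operator.length_hint() deliberately swallows a TypeError raised by __length_hint__ and falls back to its default value, which is exactly the behaviour the proposal would need to change before list() could fail fast.

```python
import operator

class InfiniteCount:
    """Pure-Python sketch of an infinite iterator (illustrative stand-in
    for itertools.count, not the actual C implementation)."""

    def __init__(self, start=0):
        self._n = start

    def __iter__(self):
        return self

    def __next__(self):
        n = self._n
        self._n += 1
        return n

    def __length_hint__(self):
        # Under the proposal, this marks the iterator as provably infinite,
        # so consumers that pre-allocate (like list()) could fail fast.
        raise TypeError("cannot compute the length of an infinite iterator")

# Today, operator.length_hint() clears the TypeError and returns the
# supplied default, so list(InfiniteCount()) would still loop forever;
# the proposal would let the TypeError propagate instead.
print(operator.length_hint(InfiniteCount(), 8))  # -> 8 under current semantics
```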
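[Editor's illustration] The "at most 2 items" behaviour of any(itertools.count()) is easy to see with a counting wrapper (the counted() helper below is purely illustrative): any() stops at the first truthy item, and count() yields 0 (falsy) followed by 1 (truthy), so exactly two items are consumed.

```python
import itertools

def counted(iterable, box):
    # Illustrative helper: tallies how many items the consumer pulls.
    for item in iterable:
        box[0] += 1
        yield item

pulls = [0]
result = any(counted(itertools.count(), pulls))
# any() pulls 0 (falsy), then 1 (truthy), and short-circuits; all()
# would stop even sooner, after the single falsy item 0.
print(result, pulls[0])  # -> True 2
```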
</div></div>