Membership of infinite iterators
I recently came across a bug where checking negative membership (__contains__ returns False) of an infinite iterator will freeze the program.

It may seem a bit obvious, but you can check membership in range for example without iterating over the entire range.

`int(1e100) in range(int(1e101))` on my machine takes about 1.5us
`int(1e7) in itertools.count()` on my machine takes about 250ms (and gets worse quite quickly)

Any membership test on the infinite iterators that is negative will freeze the program as stated earlier, which is odd to me.

itertools.count can use the same math as range
itertools.cycle could use membership from the underlying iterable
itertools.repeat is even easier, just compare to the repeatable element

Does anyone else think this would be useful?
On 10/15/2017 9:12 PM, Jason Campbell wrote:
I recently came across a bug where checking negative membership (__contains__ returns False) of an infinite iterator will freeze the program.
It may seem a bit obvious, but you can check membership in range for example without iterating over the entire range.
A range is a finite (non-iterator) *iterable*, not an *iterator*. In particular, a range is a finite sequence. In Python 2, ranges were implemented as lists. Python 3 now uses a more compact implementation. Either way, ranges have a .__contains__ method. (In Py3 it is O(1) instead of O(n).)
`int(1e100) in range(int(1e101))` on my machine takes about 1.5us
`int(1e7) in itertools.count()` on my machine takes about 250ms (and gets worse quite quickly).
Non-iterator iterables and iterators are different things. If you compared iter(range(n)) with itertools.count(n), you would get similar results.
Any membership test on the infinite iterators that is negative will freeze the program as stated earlier, which is odd to me.
Membership testing with iterators is generally a bad idea. In general, iterators do not have a .__contains__ method, so at best, one exhausts the iterator, making it useless. In particular, generators do not have any non-iterator, non-generator methods. The itertools module has iterator classes rather than generator functions because it is coded in C, but they are closely equivalent in behavior to the generator functions given in the doc.
itertools.count can use the same math as range
You could use range(start, 10000000000000000000[, step]) instead of count(start[, step]), or wrap count in an infinite sequence Count class that has all the methods and arithmetic of range. The __iter__ method would return a count iterator.
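For concreteness, a rough sketch of the kind of Count wrapper described above might look like the following (the name and the integer-only arithmetic are assumptions for illustration, not an existing API):

    import itertools

    class Count:
        """Infinite arithmetic progression start, start+step, ... (integers assumed)."""
        def __init__(self, start=0, step=1):
            self.start = start
            self.step = step

        def __iter__(self):
            # Each call hands out a fresh count iterator.
            return itertools.count(self.start, self.step)

        def __contains__(self, value):
            # Same O(1) arithmetic that range uses.
            if self.step == 0:
                return value == self.start
            offset = value - self.start
            if self.step > 0:
                return offset >= 0 and offset % self.step == 0
            return offset <= 0 and offset % self.step == 0

With that, `10**7 in Count()` answers immediately, while `for i in Count(): ...` still iterates over an ordinary count object.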
itertools.cycle could use membership from the underlying iterable
If the underlying iterable is an iterator, this does not work. You could define a Cycle class that requires that the input have .__contains__.
itertools.repeat is even easier, just compare to the repeatable element
Again, create a Repeat(ob) class whose .__iter__ returns repeat(ob) and that has .__contains__(x) returning 'x == ob'.
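In the same spirit, rough sketches of such Cycle and Repeat wrappers might look like this (names and details are illustrative only):

    import itertools

    class Cycle:
        def __init__(self, iterable):
            # Require a container, so membership can be delegated
            # without consuming anything.
            if not hasattr(type(iterable), '__contains__'):
                raise TypeError("Cycle needs an iterable that supports 'in'")
            self._source = iterable

        def __iter__(self):
            return itertools.cycle(self._source)

        def __contains__(self, value):
            return value in self._source

    class Repeat:
        def __init__(self, ob):
            self._ob = ob

        def __iter__(self):
            return itertools.repeat(self._ob)

        def __contains__(self, value):
            return value == self._ob

`3 in Cycle([1, 2, 3])` and `'x' in Repeat('x')` then answer without consuming anything.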
Does anyone else think this would be useful?
itertools is not going to switch from iterators to non-iterator iterables. -- Terry Jan Reedy
On Tue, Oct 17, 2017, 00:13, Terry Reedy <tjreedy@udel.edu> wrote:
On 10/15/2017 9:12 PM, Jason Campbell wrote: ...
itertools.cycle could use membership from the underlying iterable
If the underlying iterable is an iterator, this does not work. You could define a Cycle class that requires that the input have .__contains__.
itertools.repeat is even easier, just compare to the repeatable element
Again, create a Repeat(ob) class whose .__iter__ returns repeat(ob) and that has .__contains__(x) returning 'x == ob'.
Does anyone else think this would be useful?
itertools is not going to switch from iterators to non-iterator iterables.
It doesn't have to switch, and does not have to _require_ the input to define a __contains__ method for the proposal to be meaningful. It can work with the same kinds of iterables as its inputs, delegating whenever possible. I'm sure there are good reasons not to do it, but that's not an either/or decision. Elazar
Given that:

- it doesn't break anything;
- the behavior makes sense;
- it can avoid the machine freezing when one is not careful;

it can be a good idea. Those generators could have iterators that are not themselves and have a __contains__ method behaving accordingly.

I only made the mistake of using "in" on those once, but it did happen. I can imagine everybody does the same at least once. It's not a huge help, but the joy of using Python is a collection of small things after all.

On 17/10/2017 at 05:27, אלעזר wrote:
On Tue, Oct 17, 2017, 00:13, Terry Reedy <tjreedy@udel.edu <mailto:tjreedy@udel.edu>> wrote:
On 10/15/2017 9:12 PM, Jason Campbell wrote: ... > itertools.cycle could use membership from the underlying iterable
If the underlying iterable is an iterator, this does not work. You could define a Cycle class that requires that the input have .__contains__.
> itertools.repeat is even easier, just compare to the repeatable element
Again, create a Repeat(ob) class whose .__iter__ returns repeat(ob) and that has .__contains__(x) returning 'x == ob'.
> Does anyone else think this would be useful?
itertools is not going to switch from iterators to non-iterator iterables.
It doesn't have to switch, and does not have to _require_ the input to define __contains__ method for the proposal to be meaningful. It can work with the same kind of iterables as its inputs, delegating whenever possible. I'm sure there are good reasons not to do it, but that's not an either/or decision.
Elazar
On 16 October 2017 at 11:12, Jason Campbell <j_campbell7@hotmail.com> wrote:
I recently came across a bug where checking negative membership (__contains__ returns False) of an infinite iterator will freeze the program.
It may seem a bit obvious, but you can check membership in range for example without iterating over the entire range.
As Terry noted, this is due to a semantic difference between range (a compact representation of a finite sequence of integers) and arbitrary iterators. However, if we can avoid seemingly innocent code triggering an infinite loop, that's probably a good thing.
`int(1e100) in range(int(1e101))` on my machine takes about 1.5us
`int(1e7) in itertools.count()` on my machine takes about 250ms (and gets worse quite quickly).
Any membership test on the infinite iterators that is negative will freeze the program as stated earlier, which is odd to me.
The other relevant behavioural difference is that even a successful membership test will consume the iterator up to that point.
itertools.count can use the same math as range
+1, with the caveat that a successful containment test should still advance the iterator to the point immediately after the requested element, just as it does today (if people don't want that behaviour, they should be defining a suitable finite range instead).
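A pure-Python stand-in for count() with those semantics might look roughly like this (itertools.count itself is a C type, so this is only an illustrative sketch, and it assumes integer arguments):

    class CountWithContains:
        """Sketch: like itertools.count(start, step), plus O(1) membership."""
        def __init__(self, start=0, step=1):
            self._next = start
            self._step = step

        def __iter__(self):
            return self

        def __next__(self):
            value = self._next
            self._next += self._step
            return value

        def __contains__(self, value):
            # Same arithmetic range() uses, relative to the *current* position.
            if self._step == 0:
                found = value == self._next
            else:
                offset = value - self._next
                if self._step > 0:
                    found = offset >= 0 and offset % self._step == 0
                else:
                    found = offset <= 0 and offset % self._step == 0
            if found:
                # Match the status quo: a successful test leaves the iterator
                # positioned just past the element that was found.
                self._next = value + self._step
            return found

    c = CountWithContains()
    10 in c    # True, computed without looping
    next(c)    # 11, exactly as with itertools.count() today
    -1 in c    # False immediately instead of hanging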
itertools.cycle could use membership from the underlying iterable
Sort of - you'll still need to iterate through the underlying iterator at least once in order to see all of the candidate elements. However, the containment test could still be defined as running through the full cycle at most once, such that you end up either at the point immediately after the item or else back where you started (if the item wasn't found).
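A simplified pure-Python sketch of that behaviour, assuming the cycle is over a finite sequence that has already been materialised (the real cycle() would also have to handle its first pass over an arbitrary iterator):

    class CycleWithContains:
        """Sketch: cycle over a finite sequence, with bounded membership tests."""
        def __init__(self, seq):
            self._seq = list(seq)   # simplification: materialise up front
            self._pos = 0           # index of the next element to yield

        def __iter__(self):
            return self

        def __next__(self):
            if not self._seq:
                raise StopIteration
            value = self._seq[self._pos]
            self._pos = (self._pos + 1) % len(self._seq)
            return value

        def __contains__(self, value):
            n = len(self._seq)
            for i in range(n):                 # at most one full cycle
                idx = (self._pos + i) % n
                if self._seq[idx] == value:
                    self._pos = (idx + 1) % n  # end up just past the match
                    return True
            return False                       # not found: position unchanged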
itertools.repeat is even easier, just compare to the repeatable element
+1

So this sounds like a reasonable API UX improvement to me, but you'd need to ensure that you don't inadvertently change the external behaviour of *successful* containment tests.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 17 October 2017 at 16:32, Nick Coghlan <ncoghlan@gmail.com> wrote:
So this sounds like a reasonable API UX improvement to me, but you'd need to ensure that you don't inadvertently change the external behaviour of *successful* containment tests.
I should also note that there's another option here beyond just returning "False": it would also be reasonable to raise an exception like "RuntimeError('Attempted negative containment check on infinite iterator')". That way currently working code would be semantically unchanged, but code that otherwise results in an infinite loop would turn into an immediate exception instead.

The rationale for this approach would be "What you are trying to do doesn't really make sense, so we're going to complain about it, rather than just giving you an answer". The rationale against the error is that "If this item is present, advance past it, otherwise don't do anything" would be an at least arguably reasonable operation to request. Whether "x in itr" is a reasonable *spelling* of that operation would then be a different question that still favoured the "raise an exception" approach.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
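For illustration, a simple-minded count variant (step 1, integers only, names invented) taking the exception approach in the negative branch might look like:

    class StrictCount:
        """Sketch: count(start) whose failed 'in' raises instead of looping forever."""
        def __init__(self, start=0):
            self._next = start

        def __iter__(self):
            return self

        def __next__(self):
            value = self._next
            self._next += 1
            return value

        def __contains__(self, value):
            if isinstance(value, int) and value >= self._next:
                # Success keeps today's behaviour: skip past the match.
                self._next = value + 1
                return True
            raise RuntimeError(
                "Attempted negative containment check on infinite iterator")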
On Tue, Oct 17, 2017 at 04:42:35PM +1000, Nick Coghlan wrote:
I should also note that there's another option here beyond just returning "False": it would also be reasonable to raise an exception like "RuntimeError('Attempted negative containment check on infinite iterator')".
I don't think that works, even if we limit discussion to just itertools.count() rather than arbitrary iterators. Obviously we cannot wait until the entire infinite iterator is checked (that might take longer than is acceptable...) but if you only check a *finite* number before giving up, you get false negatives:

    # say we only check 100 values before raising
    0 in itertools.count(1)    # correctly raises
    101 in itertools.count(1)  # wrongly raises

If we do a computed membership test, then why raise at all? We quickly know whether or not the value is in the sequence, so there's no error to report.

Personally, I think a better approach is to give the specialist itertools iterator types a __contains__ method which unconditionally raises a warning regardless of whether the containment test returns true, false or doesn't return at all. Perhaps with a flag (module-wide?) to disable the warning, or turn it into an error.

I think a warning (by default) is better than an error because we don't really know for sure that it is an error:

    n in itertools.count()

is, on the face of it, no more an error than any other potentially infinite loop:

    while condition(n):
        ...

and like any so-called infinite loop, we can never be sure when to give up and raise. A thousand loops? A million? A millisecond? An hour? Whatever we pick, it will be a case of one-size fits none.

I appreciate that, in practice, it is easier to mess up a containment test using one of the itertools iterators than to accidentally write an infinite loop using while, and my concession to that is to raise a warning, and let the programmer decide whether to ignore it or turn it into an error.

--
Steve
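A rough Python illustration of that idea (the real change would live in itertools' C code; the wrapper class and the module-level flag here are invented for the example):

    import itertools
    import warnings

    WARN_ON_CONTAINMENT = True   # imagined module-wide switch

    class WarningCount:
        """Wraps itertools.count() and warns on any 'in' test."""
        def __init__(self, start=0, step=1):
            self._it = itertools.count(start, step)

        def __iter__(self):
            return self

        def __next__(self):
            return next(self._it)

        def __contains__(self, value):
            if WARN_ON_CONTAINMENT:
                warnings.warn("membership test on an infinite iterator",
                              RuntimeWarning, stacklevel=2)
            # Fall back to the normal search, which may never return.
            return any(item == value for item in self._it)

Users could then silence the warning with the usual warnings machinery, or promote it to an error with warnings.simplefilter('error', RuntimeWarning).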
On 17 October 2017 at 17:44, Steven D'Aprano <steve@pearwood.info> wrote:
On Tue, Oct 17, 2017 at 04:42:35PM +1000, Nick Coghlan wrote:
I should also note that there's another option here beyond just returning "False": it would also be reasonable to raise an exception like "RuntimeError('Attempted negative containment check on infinite iterator')".
I don't think that works, even if we limit discussion to just itertools.count() rather than arbitrary iterators. Obviously we cannot wait until the entire infinite iterator is checked (that might take longer than is acceptable...) but if you only check a *finite* number before giving up, you get false negatives:
    # say we only check 100 values before raising
    0 in itertools.count(1)    # correctly raises
    101 in itertools.count(1)  # wrongly raises
Nobody suggested that, as it's obviously wrong. This discussion is solely about infinite iterators that have closed form containment tests, either because they're computed (itertools.count()), or because they're based on an underlying finite sequence of values (cycle(), repeat()).
If we do a computed membership test, then why raise at all? We quickly know whether or not the value is in the sequence, so there's no error to report.
Because we should probably always be raising for these particular containment checks, and it's safe to start doing so in the negative case, since that's currently a guaranteed infinite loop.

And unlike a "while True" loop (which has many real world applications), none of these implicit infinite loops allow for any kind of useful work on each iteration, they just end up in a tight loop deep inside the interpreter internals, doing absolutely nothing. They won't even check for signals or release the GIL, so you'll need to ask the operating system to clobber the entire process to break out of it - Ctrl-C will be ignored.

I'd also have no major objection to deprecating containment tests on these iterators entirely, but that doesn't offer the same kind of UX benefit that replacing an infinite loop with an immediate exception does, so I think the two questions should be considered separately.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 17.10.17 09:42, Nick Coghlan wrote:
On 17 October 2017 at 16:32, Nick Coghlan <ncoghlan@gmail.com <mailto:ncoghlan@gmail.com>> wrote:
So this sounds like a reasonable API UX improvement to me, but you'd need to ensure that you don't inadvertently change the external behaviour of *successful* containment tests.
I should also note that there's another option here beyond just returning "False": it would also be reasonable to raise an exception like "RuntimeError('Attempted negative containment check on infinite iterator')".
What about other operations with infinite iterators? min(count()), max(count()), all(count(1))? Do you want to implement special cases for all of them?
On 17 October 2017 at 19:19, Serhiy Storchaka <storchaka@gmail.com> wrote:
On 17.10.17 09:42, Nick Coghlan wrote:
On 17 October 2017 at 16:32, Nick Coghlan <ncoghlan@gmail.com <mailto: ncoghlan@gmail.com>> wrote:
So this sounds like a reasonable API UX improvement to me, but you'd need to ensure that you don't inadvertently change the external behaviour of *successful* containment tests.
I should also note that there's another option here beyond just returning "False": it would also be reasonable to raise an exception like "RuntimeError('Attempted negative containment check on infinite iterator')".
What about other operations with infinite iterators? min(count()), max(count()), all(count(1))? Do you want to implement special cases for all of them?
No, as folks expect those to iterate without the opportunity to break out, and are hence more careful with them when infinite iterators are part of their application. We also don't have any existing protocols we could use to intercept them, even if we decided we *did* want to do so.

The distinction I see with "x in y" is:

1. It's pretty easy to write "for x in y in y" when you really meant to write "for x in y", and if "y" is an infinite iterator, the "y in y" part will become an unbreakable infinite loop when executed instead of the breakable one you intended (especially annoying if it means you have to discard and restart a REPL session due to it, and that's exactly where that kind of typo is going to be easiest to make)

2. Containment testing already has a dedicated protocol so containers can implement optimised containment tests, which means it's also trivial for an infinite iterator to intercept and explicitly disallow containment checks if it chooses to do so

So the problem is more likely to be encountered due to "x in y" appearing in both the containment test syntax and as part of the iteration syntax, *and* it's straightforward to do something about it because the __contains__ hook already exists.

Those two points together are enough for me to say "Sure, it makes sense to replace the current behaviour with something more user friendly". If either of them was false, then I'd say "No, that's not worth the hassle of changing anything".

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 17.10.17 14:10, Nick Coghlan wrote:
1. It's pretty easy to write "for x in y in y" when you really meant to write "for x in y", and if "y" is an infinite iterator, the "y in y" part will become an unbreakable infinite loop when executed instead of the breakable one you intended (especially annoying if it means you have to discard and restart a REPL session due to it, and that's exactly where that kind of typo is going to be easiest to make)
I think it is better to leave this to linters. I never encountered this mistake and doubt it is common. In any case, the first execution of this code will expose the mistake.
2. Containment testing already has a dedicated protocol so containers can implement optimised containment tests, which means it's also trivial for an infinite iterator to intercept and explicitly disallow containment checks if it chooses to do so
But this has a non-zero maintenance cost. As the one who made many changes in itertools.c, I don't like the idea of increasing its complexity for optimizing a pretty rare case.

And note that the comparison can have side effects. You can implement the optimization of `x in count()` only for a limited set of builtin types. For example, `x in range()` is optimized only for exact int and bool. You can't guarantee finite time for cycle() and repeat() either, since they can emit values of arbitrary types, with arbitrary __eq__.
On Tue, Oct 17, 2017 at 2:46 PM, Serhiy Storchaka <storchaka@gmail.com> wrote:
On 17.10.17 14:10, Nick Coghlan wrote:
1. It's pretty easy to write "for x in y in y" when you really meant to write "for x in y", and if "y" is an infinite iterator, the "y in y" part will become an unbreakable infinite loop when executed instead of the breakable one you intended (especially annoying if it means you have to discard and restart a REPL session due to it, and that's exactly where that kind of typo is going to be easiest to make)
I think it is better to leave this to linters.
Just to note that there is currently nothing that would prevent making `for x in y in z` a syntax error. There is nothing meaningful that it could do, really, because y in z can only return True or False (or raise an Exception or loop infinitely). But for an infinite iterable, the right answer may be Maybe ;) ––Koos
I never encountered this mistake and doubt it is common. In any case the first execution of this code will expose the mistake.
2. Containment testing already has a dedicated protocol so containers can
implement optimised containment tests, which means it's also trivial for an infinite iterator to intercept and explicitly disallow containment checks if it chooses to do so
But this has a non-zero maintenance cost. As the one who made many changes in itertools.c, I don't like the idea of increasing its complexity for optimizing a pretty rare case.
And note that the comparison can have side effects. You can implement the optimization of `x in count()` only for a limited set of builtin types. For example, `x in range()` is optimized only for exact int and bool. You can't guarantee finite time for cycle() and repeat() either, since they can emit values of arbitrary types, with arbitrary __eq__.
-- + Koos Zevenhoven + http://twitter.com/k7hoven +
On 17 October 2017 at 23:17, Koos Zevenhoven <k7hoven@gmail.com> wrote:
On Tue, Oct 17, 2017 at 2:46 PM, Serhiy Storchaka <storchaka@gmail.com> wrote:
On 17.10.17 14:10, Nick Coghlan wrote:
1. It's pretty easy to write "for x in y in y" when you really meant to write "for x in y", and if "y" is an infinite iterator, the "y in y" part will become an unbreakable infinite loop when executed instead of the breakable one you intended (especially annoying if it means you have to discard and restart a REPL session due to it, and that's exactly where that kind of typo is going to be easiest to make)
I think it is better to leave this to linters.
Just to note that there is currently nothing that would prevent making `for x in y in z` a syntax error. There is nothing meaningful that it could do, really, because y in z can only return True or False (or raise an Exception or loop infinitely).
That was just an example of one of the ways we can accidentally end up writing "x in y" at the REPL, where "y" is an infinite iterator, since it's the kind that's specific to "x in y", whereas other forms (like accidentally using the wrong variable name) also apply to other iterator consuming APIs (like the ones Serhiy mentioned). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 17 October 2017 at 21:46, Serhiy Storchaka <storchaka@gmail.com> wrote:
On 17.10.17 14:10, Nick Coghlan wrote:
1. It's pretty easy to write "for x in y in y" when you really meant to write "for x in y", and if "y" is an infinite iterator, the "y in y" part will become an unbreakable infinite loop when executed instead of the breakable one you intended (especially annoying if it means you have to discard and restart a REPL session due to it, and that's exactly where that kind of typo is going to be easiest to make)
I think it is better to leave this to linters. I never encountered this mistake and doubt it is common. In any case, the first execution of this code will expose the mistake.
People don't run linters at the REPL, and it's at the REPL where accidentally getting an unbreakable infinite loop is most annoying. Keep in mind we're not talking about a regular loop you can break out of with Ctrl-C here - we're talking about a tight loop inside the interpreter internals that leads to having to kill the whole host process just to get out of it.
2. Containment testing already has a dedicated protocol so containers can
implement optimised containment tests, which means it's also trivial for an infinite iterator to intercept and explicitly disallow containment checks if it chooses to do so
But this has a non-zero maintenance cost. As the one who made many changes in itertools.c, I don't like the idea of increasing its complexity for optimizing a pretty rare case.
It's not an optimisation, it's a UX improvement for the interactive prompt. The maintenance burden should be low, as it's highly unlikely we'd ever need to change this behaviour again in the future (I do think deprecating the success case would be more trouble than it would be worth though).
And note that the comparison can have side effects. You can implement the optimization of `x in count()` only for a limited set of builtin types. For example, `x in range()` is optimized only for exact int and bool. You can't guarantee finite time for cycle() and repeat() either, since they can emit values of arbitrary types, with arbitrary __eq__.
We're not trying to guarantee finite execution time in general, we're just making it more likely that either Ctrl-C works, or else you don't get stuck in an infinite loop in the first place. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 17.10.17 17:06, Nick Coghlan wrote:
Keep in mind we're not talking about a regular loop you can break out of with Ctrl-C here - we're talking about a tight loop inside the interpreter internals that leads to having to kill the whole host process just to get out of it.
And this is the root of the issue. Just let more tight loops be interruptible with Ctrl-C, and this will fix the more general issue.
On Tue, Oct 17, 2017 at 5:26 PM, Serhiy Storchaka <storchaka@gmail.com> wrote:
On 17.10.17 17:06, Nick Coghlan wrote:
Keep in mind we're not talking about a regular loop you can break out of with Ctrl-C here - we're talking about a tight loop inside the interpreter internals that leads to having to kill the whole host process just to get out of it.
And this is the root of the issue. Just let more tight loops be interruptible with Ctrl-C, and this will fix the more general issue.
Not being able to interrupt something with Ctrl-C in the REPL or with the interrupt command in Jupyter notebooks is definitely a thing I sometimes encounter. A pity I don't remember when it happens, because I usually forget it very soon after I've restarted the kernel and continued working. But my guess is it's usually not because of an infinite iterator.

Regarding what the OP might have been after, and just for some wild brainstorming based on true stories:

In some sense, x in y should always have an answer, even if it may be expensive to compute. Currently, it's possible to implement "lazy truth values" which compute the bool value lazily when .__bool__() is called. Until you call bool(..) on it, it would just be Maybe, and then after the call, you'd actually have True or False. In many cases it can even be enough to know if something is Maybe true. Also, if you do something like any(truth_values), then you could skip the Maybe ones on the first pass, because if you find one that's plain True, you already have the answer.

Regarding `x in y`, where y is an infinite iterable without well-defined contents, that would return an instance of MaybeType, but .__bool__() would raise an exception.

––Koos

--
+ Koos Zevenhoven + http://twitter.com/k7hoven +
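A toy rendition of that "lazy truth value" idea (entirely hypothetical, nothing like this exists in the stdlib):

    class Maybe:
        """Truth value that is only computed when somebody actually asks."""
        def __init__(self, compute):
            self._compute = compute   # callable returning True/False, possibly slowly
            self._value = None

        def __bool__(self):
            if self._value is None:
                self._value = bool(self._compute())
            return self._value

    # 'x in y' on an infinite iterator could, in principle, hand back
    # Maybe(lambda: <the actual search>); forcing it with bool() might hang
    # or raise, but code that never forces the answer never pays for it.
    answer = Maybe(lambda: 10 in range(100))
    print(bool(answer))   # True, computed only at this point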
On 18 October 2017 at 03:39, Koos Zevenhoven <k7hoven@gmail.com> wrote:
On Tue, Oct 17, 2017 at 5:26 PM, Serhiy Storchaka <storchaka@gmail.com> wrote:
On 17.10.17 17:06, Nick Coghlan wrote:
Keep in mind we're not talking about a regular loop you can break out of with Ctrl-C here - we're talking about a tight loop inside the interpreter internals that leads to having to kill the whole host process just to get out of it.
And this is the root of the issue. Just let more tight loops be interruptible with Ctrl-C, and this will fix the more general issue.
Not being able to interrupt something with Ctrl-C in the repl or with the interrupt command in Jupyter notebooks is definitely a thing I sometimes encounter. A pity I don't remember when it happens, because I usually forget it very soon after I've restarted the kernel and continued working. But my guess is it's usually not because of an infinite iterator.
Fixing the general case is hard, because the assumption that signals are only checked between interpreter opcodes is a pervasive one throughout the interpreter internals. We certainly *could* redefine affected C APIs as potentially raising KeyboardInterrupt (adjusting the signal management infrastructure accordingly), and if someone actually follows through and implements that some day, then the argument could be made that, given such a change, it might be reasonable to drop any a priori guards that we have put in place for particular *detectable* uninterruptible infinite loops.

However, that's not the design question being discussed in this thread. The design question here is "We have 3 known uninterruptible infinite loops that are easy to detect and prevent. Should we detect and prevent them?". "We shouldn't allow anyone to do this easy thing, because it would be preferable for someone to instead do this hard and complicated thing that nobody is offering to do" isn't a valid design argument in that situation.

And I have a four step check for that which prompts me to say "Yes, we should detect and prevent them":

1. Uninterruptible loops are bad, so having fewer of them is better
2. These particular cases can be addressed locally using existing protocols, so the chances of negative side effects are low
3. The total amount of code involved is likely to be small (a dozen or so lines of C, a similar number of lines of Python in the tests) in well-isolated protocol functions, so the chances of introducing future maintainability problems are low
4. We have a potential contributor who is presumably offering to do the work (if that's not the case, then the question is moot anyway until a sufficiently interested volunteer turns up)

As an alternative implementation approach, the case could also be made that these iterators should be raising TypeError in __length_hint__, as that protocol method is explicitly designed to be used for finite container pre-allocation. That way things like "list(itertools.count())" would fail immediately (similar to the way "list(range(10**100))" already does) rather than attempting to consume all available memory before (hopefully) finally failing with MemoryError.

If we were to do that, then we *could* make the solution to the reported problem more general by having all builtin and standard library operations that expect to be working with finite iterators (the containment testing fallback, min, max, sum, any, all, functools.reduce, etc) check for a length hint, even if they aren't actually pre-allocating any memory. Then the general purpose marker for "infinite iterator" would be "Explicitly defines __length_hint__ to raise TypeError", and it would prevent a priori all operations that attempted to fully consume the iterator.

That more general approach would cause some currently "working" code (like "any(itertools.count())" and "all(itertools.count())", both of which consume at most 2 items from the iterator) to raise an exception instead, and hence would require the introduction of a DeprecationWarning in 3.7 (where the affected APIs would start calling the length hint, but suppress any exceptions from it), before allowing the exception to propagate in 3.8+.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
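A Python-level sketch of that convention (the class and helper names are invented for illustration; today's builtins don't reject such iterators, which is why the consuming APIs would need to start checking the hint explicitly):

    class InfiniteCount:
        """Stand-in for itertools.count() that declares itself infinite."""
        def __init__(self, start=0, step=1):
            self._next = start
            self._step = step

        def __iter__(self):
            return self

        def __next__(self):
            value = self._next
            self._next += self._step
            return value

        def __length_hint__(self):
            # The proposed convention: raise TypeError to say "no finite length".
            raise TypeError("infinite iterator")

    def consume_finite(iterable):
        """Consumer that refuses self-declared infinite input before looping."""
        hint = getattr(type(iterable), '__length_hint__', None)
        if hint is not None:
            hint(iterable)   # propagates TypeError for InfiniteCount
        return list(iterable)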
On 18.10.17 13:22, Nick Coghlan wrote:
2. These particular cases can be addressed locally using existing protocols, so the chances of negative side effects are low
Only the particular case `count() in count()` can be addressed without breaking the following examples:
>>> class C:
...     def __init__(self, count):
...         self.count = count
...     def __eq__(self, other):
...         print(self.count, other)
...         if not self.count:
...             return True
...         self.count -= 1
...         return False
...
>>> import itertools
>>> C(5) in itertools.count()
5 0
4 1
3 2
2 3
1 4
0 5
True
>>> it = itertools.cycle([C(5)]); it in it
5 <itertools.cycle object at 0x7f65512d39b8>
4 <itertools.cycle object at 0x7f65512d39b8>
3 <itertools.cycle object at 0x7f65512d39b8>
2 <itertools.cycle object at 0x7f65512d39b8>
1 <itertools.cycle object at 0x7f65512d39b8>
0 <itertools.cycle object at 0x7f65512d39b8>
True
>>> it = itertools.repeat(C(5)); it in it
5 repeat(<__main__.C object at 0x7f65512c5dc0>)
4 repeat(<__main__.C object at 0x7f65512c5dc0>)
3 repeat(<__main__.C object at 0x7f65512c5dc0>)
2 repeat(<__main__.C object at 0x7f65512c5dc0>)
1 repeat(<__main__.C object at 0x7f65512c5dc0>)
0 repeat(<__main__.C object at 0x7f65512c5dc0>)
True
3. The total amount of code involved is likely to be small (a dozen or so lines of C, a similar number of lines of Python in the tests) in well-isolated protocol functions, so the chances of introducing future maintainability problems are low
It depends on what you want to achieve: just prevent an infinite loop in `count() in count()`, or optimize `int in count()`, or optimize more special cases.
4. We have a potential contributor who is presumably offering to do the work (if that's not the case, then the question is moot anyway until a sufficiently interested volunteer turns up)
Maintenance is more than writing the initial code.
If we were to do that, then we *could* make the solution to the reported problem more general by having all builtin and standard library operations that expect to be working with finite iterators (the containment testing fallback, min, max, sum, any, all, functools.reduce, etc) check for a length hint, even if they aren't actually pre-allocating any memory.
This will add a significant overhead for relatively short (hundreds of items) sequences. I already did benchmarking for similar cases in the past.
On 18 October 2017 at 22:53, Serhiy Storchaka <storchaka@gmail.com> wrote:
On 18.10.17 13:22, Nick Coghlan wrote:
2. These particular cases can be addressed locally using existing protocols, so the chances of negative side effects are low
Only the particular case `count() in count()` can be addressed without breaking the following examples:
You're right, the potential impact on objects with weird __eq__ implementations would mean that even the `__contains__` approach would require deprecation warnings for the APIs that allow short-circuiting. So we can discard it in favour of exploring the "Can we make a beneficial improvement via __length_hint__?" question.
3. The total amount of code involved is likely to be small (a dozen or so
lines of C, a similar number of lines of Python in the tests) in well-isolated protocol functions, so the chances of introducing future maintainability problems are low
It depends on what you want to achieve: just prevent an infinite loop in `count() in count()`, or optimize `int in count()`, or optimize more special cases.
My interest lies specifically in reducing the number of innocent looking ways we offer to provoke uninterruptible infinite loops or unbounded memory consumption.
4. We have a potential contributor who is presumably offering to do the
work (if that's not the case, then the question is moot anyway until a sufficiently interested volunteer turns up)
Maintenance is more than writing the initial code.
Aye, that's why the preceding point was to ask how large a change we'd be offering to maintain indefinitely, and how well isolated that change would be.
If we were to do that, then we *could* make the solution to the reported
problem more general by having all builtin and standard library operations that expect to be working with finite iterators (the containment testing fallback, min, max, sum, any, all, functools.reduce, etc) check for a length hint, even if they aren't actually pre-allocating any memory.
This will add a significant overhead for relatively short (hundreds of items) sequences. I already did benchmarking for similar cases in the past.
I did wonder about that, so I guess the baseline zero-risk enhancement idea would be to only prevent the infinite loop in cases that already request a length hint as a memory pre-allocation check. That would reduce the likelihood of the most painful case (grinding the machine to a halt), without worrying about the less painful cases (which will break the current process, but the rest of the machine will be fine).

Given that, adding TypeError raising __length_hint__ implementations to itertools.count(), itertools.cycle(), and itertools.repeat() would make sense as an independent RFE, without worrying about any APIs that don't already check for a length hint.

A more intrusive option would then be to look at breaking the other tight iteration loops into two phases, such that checking for potentially infinite iterators could be delayed until after the first thousand iterations or so. That option is potentially worth exploring anyway, since breaking up the current single level loops as nested loops would be a pre-requisite for allowing these APIs to check for signals while they're running while keeping the per-iteration overhead low (only one pre-requisite of many though, and probably one of the easier ones).

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
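As a Python-level illustration of the two phase idea (the loops in question are actually C level, and the threshold of 1000 is an arbitrary stand-in):

    import itertools

    def sum_checked(iterable, threshold=1000):
        it = iter(iterable)
        total = 0
        # Phase 1: tight loop over the first `threshold` items.
        for item in itertools.islice(it, threshold):
            total += item
        # Phase 2: before carrying on, see whether the source declares
        # itself infinite via a TypeError-raising length hint.
        hint = getattr(type(it), '__length_hint__', None)
        if hint is not None:
            hint(it)   # a TypeError here aborts instead of looping forever
        for item in it:
            total += item
        return total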
Nick Coghlan wrote:
since breaking up the current single level loops as nested loops would be a pre-requisite for allowing these APIs to check for signals while they're running while keeping the per-iteration overhead low
Is there really much overhead? Isn't it just checking a flag? -- Greg
On 19 October 2017 at 08:34, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Nick Coghlan wrote:
since breaking up the current single level loops as nested loops would be a pre-requisite for allowing these APIs to check for signals while they're running while keeping the per-iteration overhead low
Is there really much overhead? Isn't it just checking a flag?
It's checking an atomically updated flag, so it forces CPU cache synchronisation, which means you don't want to be doing it on every iteration of a low level loop.

However, reviewing Serhiy's PR reminded me that PyErr_CheckSignals() already encapsulates the "Should this thread even be checking for signals in the first place?" logic, which means the code change to make the itertools iterators inherently interruptible with Ctrl-C is much smaller than I thought it would be. That approach is also clearly safe from an exception handling point of view, since all consumer loops already need to cope with the fact that itr.__next__() may raise arbitrary exceptions (including KeyboardInterrupt).

So that change alone already offers a notable improvement, and combining it with a __length_hint__() implementation that keeps container constructors from even starting to iterate would go even further towards making the infinite iterators more user friendly.

Similar signal checking changes to the consumer loops would also be possible, but I don't think that's an either/or decision: changing the iterators means they'll be interruptible for any consumer, while changing the consumers would make them interruptible for any iterator, and having checks in both the producer & the consumer merely means that you'll be checking for signals twice every 65k iterations, rather than once.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
Nick Coghlan wrote:
having checks in both the producer & the consumer merely means that you'll be checking for signals twice every 65k iterations, rather than once.
Here's a possible reason for wanting checks in the producers: If your producer happens to take a long time per iteration, and the consumer only checks every 65k iterations, it might be a while before a Ctrl-C takes effect. If the producer is checking, it is likely to have a better idea of what an appropriate checking interval might be. -- Greg
On Thu, Oct 19, 2017 at 3:42 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 19 October 2017 at 08:34, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Nick Coghlan wrote:
since breaking up the current single level loops as nested loops would be a pre-requisite for allowing these APIs to check for signals while they're running while keeping the per-iteration overhead low
Is there really much overhead? Isn't it just checking a flag?
It's checking an atomically updated flag, so it forces CPU cache synchronisation, which means you don't want to be doing it on every iteration of a low level loop.
Even just that it's a C function call makes me not want to recommend doing it in a lot of tight loops. Who knows what the function does anyway, let alone what it might or might not do in the future.
However, reviewing Serhiy's PR reminded me that PyErr_CheckSignals() already encapsulates the "Should this thread even be checking for signals in the first place?" logic, which means the code change to make the itertools iterators inherently interruptible with Ctrl-C is much smaller than I thought it would be.
And if it didn't encapsulate that, you would probably have written a wrapper that does. Good thing it's the wrapper that's exposed in the API.
That approach is also clearly safe from an exception handling point of view, since all consumer loops already need to cope with the fact that itr.__next__() may raise arbitrary exceptions (including KeyboardInterrupt).
So that change alone already offers a notable improvement, and combining it
with a __length_hint__() implementation that keeps container constructors from even starting to iterate would go even further towards making the infinite iterators more user friendly.
Similar signal checking changes to the consumer loops would also be possible, but I don't think that's an either/or decision: changing the iterators means they'll be interruptible for any consumer, while changing the consumers would make them interruptible for any iterator, and having checks in both the producer & the consumer merely means that you'll be checking for signals twice every 65k iterations, rather than once.
Indeed it's not strictly an either/or decision, but more about where we might spend time executing C code. But I'm leaning a bit towards doing it on the consumer side, because there it's more obvious that the code might take some time to run. If the consumer ends up iterating over pure-Python objects, there are no concerns about the overhead. But if it *does* call a C-implemented __next__, then that's the case where we actually need the whole thing. Adding the check in both places would double the (small) overhead. And nested (wrapped) iterators are also a thing.

––Koos

--
+ Koos Zevenhoven + http://twitter.com/k7hoven +
Let me add something to the discussion that we had last month about code that cannot be interrupted with Ctrl-C or with the Jupyter Notebook interrupt command etc. I wasn't completely sure if anyone cared much about being able to interrupt long-running code and about things like atomic operations etc., but maybe I'll just write this one down anyway while it's still in my head.

As a reminder of what we discussed: for KeyboardInterrupt to happen, C code needs to manually call PyErr_CheckSignals() regularly to handle signals, but it might be too much overhead to do that in tight loops. But where does the overhead come from and what could perhaps be done? Read below:

On Thu, Oct 19, 2017 at 10:54 AM, Koos Zevenhoven <k7hoven@gmail.com> wrote:
On Thu, Oct 19, 2017 at 3:42 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 19 October 2017 at 08:34, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Nick Coghlan wrote:
since breaking up the current single level loops as nested loops would be a pre-requisite for allowing these APIs to check for signals while they're running while keeping the per-iteration overhead low
Is there really much overhead? Isn't it just checking a flag?
It's checking an atomically updated flag, so it forces CPU cache synchronisation, which means you don't want to be doing it on every iteration of a low level loop.
Even just that it's a C function call makes me not want to recommend doing it in a lot of tight loops. Who knows what the function does anyway, let alone what it might or might not do in the future.
While what Nick states above about CPU cache synchronization doesn't sound quite **right**, it may take a core developer a whole day to roughly figure out if the statement is even **true**. In particular, it is not obvious if it may **occasionally** be true, or if it is **always** true in **some cases**, or if it's *sometimes* true in *all cases*. The difficulties are partly because pyatomic.h [1] is "arranged" in layers of branched C preprocessor directives [2], and because the whole of pyatomic.h is in fact "deep magic", as is confirmed by a comment in the beginning of the file. Some of the complexity has emerged after that comment was already there.

SPOILER ALERT! At the moment, Nick's statement is in fact **always** true in **all** cases (at least when ignoring untypical cases and some inaccuracies in phrasing). Another question is whether the statement **should** be true at all.

PyErr_CheckSignals(), the function that checks for pending signals, now **implicitly** uses the strictest possible memory-order requirement (SEQ_CST) for checking the `is_tripped` flag, a value which can be used to peek whether there are any pending signals. This means that two threads that call PyErr_CheckSignals can't even **check** the value of that flag at the same time, and they'll have to wait for each other and whatever the CPU hardware implementation does for cache synchronization.

From a correctness point of view, that is absolutely great: if PyErr_CheckSignals() is called, it is guaranteed to notice a new signal regardless of how small the number of picoseconds after the `is_tripped` flag has been set. But is that really important? Do we really need things to be slowed down by this? And do we want people to "optimize" code by avoiding signal checking?

The signals won't be caught until checked anyway, and as we've seen, one solution is to use a counter to determine if enough time has passed that another check for potential signals should happen. That does, however, raise the question of who should own the counter, and if it's OK for multiple threads to use the same counter. If yes, then would we be able to avoid slow atomic decrements (or increments)?

But another solution might be to make a less strict but almost equivalent functionality with much less overhead. Something like this in a header file:

    static inline int PyErr_PROBE_SIGNALS() {
        static int volatile *flag = (int volatile *) &is_tripped;
        if (*flag) {
            return PyErr_CheckSignals();
        }
        else {
            return 0;
        }
    }

Here, the flag is cast to volatile to force a check to happen each time from memory/cache. However, to make it more standard and to make sure it works with all compilers/platforms, it might be better to, instead of *flag, use an atomic load with "MEMORY_ORDER_RELAXED". Another thing is that `is_tripped`, which is currently declared as static inside signalmodule.c [4], should somehow be exposed in the API to make it available for the inline function in headers.

This solution might be fast enough for a lot of cases, although still slightly slower than the counter approach, at least if the counter approach would completely avoid per-iteration memory access by requiring each function to own the counter that is used for this.

So you read the whole email? Cool!

––Koos

[1] https://github.com/python/cpython/blob/master/Include/pyatomic.h
[2] C preprocessor code is kind of like commented-out Python code, except that not even the Spanish Inquisition [3] expects you to indent the blocks that they define.
[3] https://www.youtube.com/watch?v=Nf_Y4MbUCLY
[4] https://github.com/python/cpython/blob/master/Modules/signalmodule.c
However, reviewing Serhiy's PR reminded me that PyErr_CheckSignals() already encapsulates the "Should this thread even be checking for signals in the first place?" logic, which means the code change to make the itertools iterators inherently interruptible with Ctrl-C is much smaller than I thought it would be.
And if it didn't encapsulate that, you would probably have written a wrapper that does. Good thing it's the wrapper that's exposed in the API.
That approach is also clearly safe from an exception handling point of view, since all consumer loops already need to cope with the fact that itr.__next__() may raise arbitrary exceptions (including KeyboardInterrupt).
So that change alone already offers a notable improvement, and combining
it with a __length_hint__() implementation that keeps container constructors from even starting to iterate would go even further towards making the infinite iterators more user friendly.
Similar signal checking changes to the consumer loops would also be possible, but I don't think that's an either/or decision: changing the iterators means they'll be interruptible for any consumer, while changing the consumers would make them interruptible for any iterator, and having checks in both the producer & the consumer merely means that you'll be checking for signals twice every 65k iterations, rather than once.
Indeed it's not strictly an either/or decision, but more about where we might spend time executing C code. But I'm leaning a bit towards doing it on the consumer side, because there it's more obvious that the code might take some time to run.
If the consumer ends up iterating over pure-Python objects, there are no concerns about the overhead. But if it *does* call a C-implemented __next__, then that's the case where we actually need the whole thing. Adding the check in both places would double the (small) overhead. And nested (wrapped) iterators are also a thing.
––Koos
-- + Koos Zevenhoven + http://twitter.com/k7hoven +
-- + Koos Zevenhoven + http://twitter.com/k7hoven +
On 2 November 2017 at 04:29, Koos Zevenhoven <k7hoven@gmail.com> wrote:
SPOILER ALERT! At the moment, Nick's statement is in fact **always** true in **all** cases (at least when ignoring untypical cases and some inaccuracies in phrasing). Another question is whether the statement **should** be true at all.
PyErr_CheckSignals(), the function that checks for pending signals, now **implicitly** uses the strictest possible memory-order requirement (SEQ_CST) for checking the `is_tripped` flag, a value which can be used to peek whether there are any pending signals. This means that two threads that call PyErr_CheckSignals can't even **check** the value of that flag at the same time, and they'll have to wait for each other and whatever the CPU hardware implementation does for cache synchronization.
Nice, that's deeper than I went - I just assumed there was an interlocked_increment in there somewhere, without checking whether or not there were any optimised code paths that managed to avoid that call :)
From a correctness point of view, that is absolutely great: if PyErr_CheckSignals() is called, it is guaranteed to notice a new signal regardless of how small the number of picoseconds after the `is_tripped` flag has been set. But is that really important? Do we really need things to be slowed down by this? And do we want people to "optimize" code by avoiding signal checking?
It isn't signal handling latency that's the concern here, it's potentially missing signals. Consider the case where we have 3 threads: A, B, C. A is the main thread that will actually handle signals, B and C both happened to receive them. We'll also assume the two threads receive *different* signals (otherwise they'll potentially get collapsed into one notification regardless). The scenario we want to avoid is having one or both of the signals set, but is_tripped cleared. With an interlocked check (where both 0->1 and 1->0 are memory barriers), that clearly can't happen. I suspect you're right that this could also be achieved with a less strict memory sync requirement on the 0->1 check, but that's much harder to reason about than simply limiting the number of iterations we make through an iterator consumption loop before checking for signals.
The signals won't be caught until checked anyway, and as we've seen, one solution is to use a counter to determine if enough time has passed that another check for potential signals should happen. That does, however, raise the question of who should own the counter, and if it's OK for multiple threads to use the same counter. If yes, then would we be able to avoid slow atomic decrements (or increments)?
But another solution might be to make a less strict but almost equivalent functionality with much less overhead. Something like this in a header file:
    static inline int PyErr_PROBE_SIGNALS() {
        static int volatile *flag = (int volatile *) &is_tripped;
        if (*flag) {
            return PyErr_CheckSignals();
        }
        else {
            return 0;
        }
    }
Here, the flag is cast to volatile to force a check to happen each time from memory/cache. However, to make it more standard and to make sure it works with all compilers/platforms, it might be better to, instead of *flag, use an atomic load with "MEMORY_ORDER_RELAXED". Another thing is that `is_tripped`, which is currently declared as static inside signalmodule.c [4], should somehow be exposed in the API to make it available for the inline function in headers.
This solution might be fast enough for a lot of cases, although still slightly slower than the counter approach, at least if the counter approach would completely avoid per-iteration memory access by requiring each function to own the counter that is used for this.
One thing I like about a nested inner loop is that it's easy to understand the rationale for it *without* knowing any of the details of how memory synchronisation works, as it's just:

- we want to check for signals regularly so the latency in interrupt handling isn't too high
- PyErr_CheckSignals() is too expensive to call on every iteration
- so we only check every N iterations

Memory synchronisation then only comes up if someone asks why "PyErr_CheckSignals" is expensive to call. And while it wouldn't surprise me at all if you're right and there are ways to make that call cheaper, they're still never going to be cheaper than explicitly reducing the frequency with which it is called.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
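In Python-flavoured pseudocode, the pattern is simply this (the real loop is C, and the 65536 interval just mirrors the "every 65k iterations" figure above):

    def consume(iterator, handle, check_signals, interval=65536):
        countdown = interval
        for item in iterator:
            handle(item)
            countdown -= 1
            if countdown == 0:
                check_signals()        # the expensive call, e.g. may raise KeyboardInterrupt
                countdown = interval   # only paid once per `interval` iterations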
On Thu, Nov 2, 2017 at 3:39 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 2 November 2017 at 04:29, Koos Zevenhoven <k7hoven@gmail.com> wrote:
SPOILER ALERT! At the moment, Nick's statement is in fact **always** true in **all** cases (at least when ignoring untypical cases and some inaccuracies in phrasing). Another question is whether the statement **should** be true at all.
PyErr_CheckSignals(), the function that checks for pending signals, now **implicitly** uses the strictest possible memory-order requirement (SEQ_CST) for checking the `is_tripped` flag, a value which can be used to peek whether there are any pending signals. This means that two threads that call PyErr_CheckSignals can't even **check** the value of that flag at the same time, and they'll have to wait for each other and whatever the CPU hardware implementation does for cache synchronization.
Nice, that's deeper than I went - I just assumed there was an interlocked_increment in there somewhere, without checking whether or not there were any optimised code paths that managed to avoid that call :)
From a correctness point of view, that is absolutely great: if PyErr_CheckSignals() is called, it is guaranteed to notice a new signal regardless of how small the number of picoseconds after the `is_tripped` flag has been set. But is that really important? Do we really need things to be slowed down by this? And do we want people to "optimize" code by avoiding signal checking?
It isn't signal handling latency that's the concern here, it's potentially missing signals. Consider the case where we have 3 threads: A, B, C. A is the main thread that will actually handle signals, B and C both happened to receive them. We'll also assume the two threads receive *different* signals (otherwise they'll potentially get collapsed into one notification regardless).
The scenario we want to avoid is having one or both of the signals set, but is_tripped cleared. With an interlocked check (where both 0->1 and 1->0 are memory barriers), that clearly can't happen. I suspect you're right that this could also be achieved with a less strict memory sync requirement on the 0->1 check, but that's much harder to reason about than simply limiting the number of iterations we make through an iterator consumption loop before checking for signals.
Missing signals is a concern, and there's another concern that one received signal might be interpreted as two separate occurrences. But it doesn't prevent us from *checking* is_tripped in a more relaxed manner, as long as the 0->1 check is *verified* using something like SEQ_CST and the associated signal handling is synchronized properly. And that's why my PyErr_PROBE_SIGNALS below should work, and wouldn't prevent multiple threads from handling signals either.
The signals won't be caught until checked anyway, and as we've seen, one solution is to use a counter to determine if enough time has passed that another check for potential signals should happen. That does, however, raise the question of who should own the counter, and if it's OK for multiple threads to use the same counter. If yes, then would we be able to avoid slow atomic decrements (or increments)?
But another solution might be to make a less strict but almost equivalent functionality with much less overhead. Something like this in a header file:
    static inline int PyErr_PROBE_SIGNALS() {
        static int volatile *flag = (int volatile *) &is_tripped;
        if (*flag) {
            return PyErr_CheckSignals();
        }
        else {
            return 0;
        }
    }
Here, the flag is cast to volatile to force a check to happen each time from memory/cache. However, to make it more standard and to make sure it works with all compilers/platforms, it might be better to, instead of *flag, use an atomic load with "MEMORY_ORDER_RELAXED". Another thing is that `is_tripped`, which is currently declared as static inside signalmodule.c [4], should somehow be exposed in the API to make it available for the inline function in headers.
This solution might be fast enough for a lot of cases, although still slightly slower than the counter approach, at least if the counter approach would completely avoid per-iteration memory access by requiring each function to own the counter that is used for this.
One thing I like about a nested inner loop is that it's easy to understand the rationale for it *without* knowing any of the details of how memory synchronisation works, as it's just:
- we want to check for signals regularly so the latency in interrupt handling isn't too high
- PyErr_CheckSignals() is too expensive to call on every iteration
- so we only check every N iterations
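For concreteness, that pattern is roughly the following (a sketch with made-up names and an arbitrary N, not actual CPython code):

#include "Python.h"

#define SIGNAL_CHECK_INTERVAL 4096   /* "N": an arbitrary example value */

/* Sketch of a consuming loop that only pays for PyErr_CheckSignals()
   once every SIGNAL_CHECK_INTERVAL items. */
static int
consume_all(PyObject *iterator)
{
    PyObject *item;
    unsigned int counter = 0;   /* cheap: stays in a register in the hot path */

    while ((item = PyIter_Next(iterator)) != NULL) {
        /* ... do something with item ... */
        Py_DECREF(item);

        if (++counter >= SIGNAL_CHECK_INTERVAL) {
            counter = 0;
            if (PyErr_CheckSignals() < 0)
                return -1;      /* e.g. KeyboardInterrupt was raised */
        }
    }
    return PyErr_Occurred() ? -1 : 0;
}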
My initial "negative" reaction to that potential solution ("I don't believe nesting loops is strictly a requirement." –K) was because going for manually splitting the loops into smaller inner loops sounds like preparing a nightmare for all C programmers. And I don't think it's possible to write a preprocessor macro that does that for you. Maybe things like Cython could in principle do this automatically, though. Using a counter can have a very small overhead, because things can happen completely in processor registers.
Memory synchronisation then only comes up if someone asks why "PyErr_CheckSignals" is expensive to call. And while it wouldn't surprise me at all if you're right and there are ways to make that call cheaper, they're still never going to be cheaper than explicitly reducing the frequency with which it is called.
There's a limit to how cheap the call to PyErr_CheckSignals() can be. As I mentioned earlier, even just the fact that it's a C function call can be too much. That's why, in the above, I used a new name PyErr_PROBE_SIGNALS() instead of optimizing the existing PyErr_CheckSignals() –– turning PyErr_CheckSignals() into a static inline function would change the ABI. I also didn't call it PyErr_CHECK_SIGNALS() because the functionality is not strictly equivalent to the existing function. ––Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven +
On Thu, 2 Nov 2017 13:27:17 +0200 Koos Zevenhoven <k7hoven@gmail.com> wrote:
There's a limit to how cheap the call to PyErr_CheckSignals() can be. As I mentioned earlier, even just the fact that it's a C function call can be too much.
That's why, in the above, I used a new name PyErr_PROBE_SIGNALS() instead of optimizing the existing PyErr_CheckSignals() –– turning PyErr_CheckSignals() into a static inline function would change the ABI. I also didn't call it PyErr_CHECK_SIGNALS() because the functionality is not strictly equivalent to the existing function.
Please. If you want to replace PyErr_CheckSignals() with something faster, the first thing to do is to prove that PyErr_CheckSignals() *is* too expensive. Regards Antoine.
On Thu, Nov 2, 2017 at 1:41 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
On Thu, 2 Nov 2017 13:27:17 +0200 Koos Zevenhoven <k7hoven@gmail.com> wrote:
There's a limit to how cheap the call to PyErr_CheckSignals() can be. As I mentioned earlier, even just the fact that it's a C function call can be too much.
That's why, in the above, I used a new name PyErr_PROBE_SIGNALS() instead of optimizing the existing PyErr_CheckSignals() –– turning PyErr_CheckSignals() into a static inline function would change the ABI. I also didn't call it PyErr_CHECK_SIGNALS() because the functionality is not strictly equivalent to the existing function.
Please. If you want to replace PyErr_CheckSignals() with something faster, the first thing to do is to prove that PyErr_CheckSignals() *is* too expensive.
I believe Serhiy proved that by showing that the overhead of his first patch was not negligible in all cases. One might also write a C-implemented fibonacci calculator or similar to prove the overhead is significant. While I agree that there's a hint of premature optimization involved, my justification for it is that I want people to be confident that when they add the check, it's not going to slow down the loop significantly. Maybe the fear of overhead is one of the reasons people write uninterruptible code. IMO, it should be made as easy as possible for the C programmer. ––Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven +
On Wed, 1 Nov 2017 20:29:56 +0200 Koos Zevenhoven <k7hoven@gmail.com> wrote:
From a correctness point of view, that is absolutely great: if PyErr_CheckSignals() is called, it is guaranteed to notice a new signal, no matter how few picoseconds ago the `is_tripped` flag was set. But is that really important?
I was going to answer "no"... but actually yes. The important case is the event loop type case:

while (1) {
    select([some file descriptors]);
    if (errno == EINTR) {
        PyErr_CheckSignals();
        if (PyErr_Occurred())
            break;
    }
    /* continue select()ing... */
}

Now say at a given point in time, no fds are actually active (or even waited for), but some signal arrives (SIGINT perhaps). select() is woken up and returns with errno EINTR. Two things then can happen:

- if PyErr_CheckSignals() notices the signal, it will run the relevant signal handler, which may raise an exception and trigger the select() loop to exit (e.g. SIGINT would raise KeyboardInterrupt)
- if PyErr_CheckSignals() misses the signal, the loop will enter again, and select() may sleep for an infinite amount of time

Of course, what we're doing with select() above can already apply for read() or other interruptible syscalls waiting for outside data... and that pattern is present a lot internally, especially since PEP 475 ("Retry system calls failing with EINTR").

Now, is the "sequentially consistent" ordering on is_tripped sufficient to guarantee that signals won't be missed on a weak-ordering platform? I *think* so, but an expert would need to check that code (or we cross our fingers and wait for a hypothetical bug report).

Regards Antoine.
On Thu, Nov 2, 2017 at 2:22 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
On Wed, 1 Nov 2017 20:29:56 +0200 Koos Zevenhoven <k7hoven@gmail.com> wrote:
From a correctness point of view, that is absolutely great: if PyErr_CheckSignals() is called, it is guaranteed to notice a new signal, no matter how few picoseconds ago the `is_tripped` flag was set. But is that really important?
I was going to answer "no"... but actually yes. The important case is the event loop type case:
while (1) {
    select([some file descriptors]);
    if (errno == EINTR) {
        PyErr_CheckSignals();
        if (PyErr_Occurred())
            break;
    }
    /* continue select()ing... */
}
Now say at a given point in time, no fds are actually active (or even waited for), but some signal arrives (SIGINT perhaps). select() is woken up and returns with errno EINTR. Two things then can happen:
- if PyErr_CheckSignals() notices the signal, it will run the relevant signal handler, which may raise an exception and trigger the select() loop to exit (e.g. SIGINT would raise KeyboardInterrupt)
- if PyErr_CheckSignals() misses the signal, the loop will enter again, and select() may sleep for an infinite amount of time
Oh! So that would provide a proper reason for my just-in-case decision to name the faster near-equivalent functionality PyErr_PROBE_SIGNALS instead of PyErr_CHECK_SIGNALS. Cross-referencing to that (thread about making Ctrl-C "always" work): https://mail.python.org/pipermail/python-ideas/2017-November/047631.html

Of course, what we're doing with select() above can already apply for read() or other interruptible syscalls waiting for outside data... and that pattern is present a lot internally, especially since PEP 475 ("Retry system calls failing with EINTR").
Now, is the "sequentially consistent" ordering on is_tripped sufficient to guarantee that signals won't be missed on a weak-ordering platform? I *think* so, but an expert would need to check that code (or we cross our fingers and wait for a hypothetical bug report).
I think the question is: Do we know for sure that is_tripped has been stored using sequentially consistent ordering prior to the call to PyErr_CheckSignals(), even if an interruptible syscall is involved? I suppose so? (But this is a separate question from the problem I was solving, of course. I'm not proposing to remove PyErr_CheckSignals()) ––Koos
Regards
Antoine.
-- + Koos Zevenhoven + http://twitter.com/k7hoven +
On Thu, 2 Nov 2017 14:57:12 +0200 Koos Zevenhoven <k7hoven@gmail.com> wrote:
Now, is the "sequentially consistent" ordering on is_tripped sufficient to guarantee that signals won't be missed on a weak-ordering platform? I *think* so, but an expert would need to check that code (or we cross our fingers and wait for a hypothetical bug report).
I think the question is: Do we know for sure that is_tripped has been stored using sequentially consistent ordering prior to the call to PyErr_CheckSignals(), even if an interruptible syscall is involved?
Yes, it is. See trip_signal() in Modules/signalmodule.c Regards Antoine.
On Thu, Nov 2, 2017 at 2:57 PM, Koos Zevenhoven <k7hoven@gmail.com> wrote:
On Thu, Nov 2, 2017 at 2:22 PM, Antoine Pitrou <solipsis@pitrou.net> wrote:
On Wed, 1 Nov 2017 20:29:56 +0200 Koos Zevenhoven <k7hoven@gmail.com> wrote:
From a correctness point of view, that is absolutely great: if PyErr_CheckSignals() is called, it is guaranteed to notice a new signal, no matter how few picoseconds ago the `is_tripped` flag was set. But is that really important?
I was going to answer "no"... but actually yes. The important case is the event loop type case:
while (1) {
    select([some file descriptors]);
    if (errno == EINTR) {
        PyErr_CheckSignals();
        if (PyErr_Occurred())
            break;
    }
    /* continue select()ing... */
}
Now say at a given point in time, no fds are actually active (or even waited for), but some signal arrives (SIGINT perhaps). select() is woken up and returns with errno EINTR. Two things then can happen:
- if PyErr_CheckSignals() notices the signal, it will run the relevant signal handler, which may raise an exception and trigger the select() loop to exit (e.g. SIGINT would raise KeyboardInterrupt)
- if PyErr_CheckSignals() misses the signal, the loop will enter again, and select() may sleep for an infinite amount of time
Oh! So that would provide a proper reason for my just-in-case decision to name the faster near-equivalent functionality PyErr_PROBE_SIGNALS instead of PyErr_CHECK_SIGNALS.
Cross-referencing to that (thread about making Ctrl-C "always" work):
https://mail.python.org/pipermail/python-ideas/2017-November/047631.html
Of course, what we're doing with select() above can already apply for read() or other interruptible syscalls waiting for outside data... and that pattern is present a lot internally, especially since PEP 475 ("Retry system calls failing with EINTR").
Now, is the "sequentially consistent" ordering on is_tripped sufficient to guarantee that signals won't be missed on a weak-ordering platform? I *think* so, but an expert would need to check that code (or we cross our fingers and wait for a hypothetical bug report).
I think the question is: Do we know for sure that is_tripped has been stored using sequentially consistent ordering prior to the call to PyErr_CheckSignals(), even if an interruptible syscall is involved? I suppose so?
(But this is a separate question from the problem I was solving, of course. I'm not proposing to remove PyErr_CheckSignals())
To continue on this: If I understand your question correctly, I'm hesitant to make strong statements about it. It would be interesting to know what we can assume about signals that happen at the same time with system calls, given the various platforms supported. Unfortunately, I don't know that.

Maybe you are concerned about whether some nuances and recent changes to signal handling could lead to harmful change in behavior in some meaningful edge cases? I can at least say that my PyErr_PROBE_SIGNALS() proposal does not introduce such issues, if the difference is documented properly:

"""PyErr_PROBE_SIGNALS() is meant for performance-critical code and is not 100% guaranteed to always see the most recent signals. If a signal being deferred is a concern, use PyErr_CheckSignals() instead."""

But more generally, if we could assume that trip_signal() and PyErr_CheckSignals() always happen in the same "CPU thread", then we wouldn't need pyatomic.h here at all. The fact that the code currently assumes that all Python signal handlers should run in the same Python thread takes care of some of these concerns without needing locks etc.

Some other concerns I can imagine by looking at some of the code in Modules/signalmodule.c:

(1) If trip_signal() and PyErr_CheckSignals() are executed concurrently, trip_signal() might set a new signal flag (using relaxed memory order) while PyErr_CheckSignals is still running. Then if PyErr_CheckSignals() sets is_tripped to zero *after* trip_signal() sets it to 1, the new signal might be deferred until the next time *some* new signal arrives, which could take an arbitrarily long amount of time, I suppose. However, it looks like this problem has been solved by always setting is_tripped to zero (with strict SEQ_CST memory order) *before* handling the individual signals. So if trip_signal() has already set is_tripped to 1 (with SEQ_CST), that prevents PyErr_CheckSignals from setting is_tripped to zero (with SEQ_CST) *and not* handling the signal. If trip_signal() has not yet finished, and therefore not set is_tripped to 1 yet, it will cause the next call to PyErr_CheckSignals to catch the signal.

(2) Again, if trip_signal() and PyErr_CheckSignals() execute concurrently, it might happen that PyErr_CheckSignals() handles the signal *before* trip_signal sets is_tripped to 1. That would cause the next call to PyErr_CheckSignals() to think there's an unhandled signal, but will most likely not find one, because it was already handled on the previous call. But that just effectively means that nothing is done. In fact, there's a comment in the code that mentions this.

(3, 4, ...) Of course there's more to take care of there, but that's unrelated to my PyErr_PROBE_SIGNALS() proposal. Anyway, at least (1) and (2) seem to already have been taken care of, and I assume you are aware of that.

––Koos

-- + Koos Zevenhoven + http://twitter.com/k7hoven +
On Fri, 3 Nov 2017 17:22:07 +0200 Koos Zevenhoven <k7hoven@gmail.com> wrote:
Maybe you are concerned about whether some nuances and recent changes to signal handling could lead to harmful change in behavior in some meaningful edge cases? I can at least say that my PyErr_PROBE_SIGNALS() proposal does not introduce such issues, if the difference is documented properly:
"""PyErr_PROBE_SIGNALS() is meant for performance-critical code and is not 100% guaranteed to always see the most recent signals. If a signal being deferred is a concern, use PyErr_CheckSignals() instead."""
Agreed.
But more generally, if we could assume that trip_signal() and PyErr_CheckSignals() always happen in the same "CPU thread", then we wouldn't need pyatomic.h here at all. The fact that the code currently assumes that all Python signal handlers should run in the same Python thread takes care of some of these concerns without needing locks etc.
Hmm... *Python* signal handlers (registered with signal.signal()) all run in the same thread, but we have no way to ensure that trip_signal() (which is a C-level signal handler called by the OS) will run in the same thread. Even if we were to mask all signals in non-main threads started by Python (e.g. with threading.Thread), there can still be threads created by third-party C libraries that we don't have any control over.
Some other concerns I can imagine by looking at some of the code in Modules/signalmodule.c:
(1) If trip_signal() and PyErr_CheckSignals() are executed concurrently, trip_signal() might set a new signal flag (using relaxed memory order) while PyErr_CheckSignals is still running. Then if PyErr_CheckSignals() sets is_tripped to zero *after* trip_signal() sets it to 1, then the new signal might be deferred until the next time *some* new signal arrives, which could take an arbitrarily long amount of time, I suppose.
However, it looks like this problem has been solved by always setting is_tripped to zero (with strict SEQ_CST memory order) *before* handling the individual signals. So if trip_signal() has already set is_tripped to 1 (with SEQ_CST), that prevents PyErr_CheckSignals from setting is_tripped to zero (with SEQ_CST) *and not* handling the signal. If trip_signal() has not yet finished, and therefore not set is_tripped to 1 yet, it will cause the next call to PyErr_CheckSignals to catch the signal.
(2) Again, if trip_signal() and PyErr_CheckSignals() execute concurrently, it might happen that PyErr_CheckSignals() handles the signal *before* trip_signal sets is_tripped to 1. That would cause the next call to PyErr_CheckSignals() to think there's an unhandled signal, but will most likely not find one, because it was already handled on the previous call. But that just effectively means that nothing is done. In fact, there's a comment in the code that mentions this.
(3, 4, ...) Of course there's more to take care of there, but that's unrelated to my PyErr_PROBE_SIGNALS() proposal. Anyway, at least (1) and (2) seem to already have been taken care of, and I assume you are aware of that.
Yes, that was my analysis too (and why I implemented it this way in the first place). But I'm glad you confirm it, since on such topics it's always easy to miss a potential problem. Regards Antoine.
On 2017-10-17 07:26, Serhiy Storchaka wrote:
17.10.17 17:06, Nick Coghlan пише:
Keep in mind we're not talking about a regular loop you can break out of with Ctrl-C here - we're talking about a tight loop inside the interpreter internals that leads to having to kill the whole host process just to get out of it.

And this is the root of the issue. Just let more tight loops be interruptible with Ctrl-C, and this will fix the more general issue.
I was just thinking the same thing. I think in general it's always bad for code to be uninterruptible with Ctrl-C. If these infinite iterators were fixed so they could be interrupted, this containment problem would be much less painful. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
On Tue, Oct 17, 2017 at 9:44 PM, Brendan Barnwell <brenbarn@brenbarn.net> wrote:
On 2017-10-17 07:26, Serhiy Storchaka wrote:
17.10.17 17:06, Nick Coghlan пише:
Keep in mind we're not talking about a regular loop you can break out of with Ctrl-C here - we're talking about a tight loop inside the interpreter internals that leads to having to kill the whole host process just to get out of it.
And this is the root of the issue. Just let more tight loops be interruptible with Ctrl-C, and this will fix the more general issue.
I was just thinking the same thing. I think in general it's always bad for code to be uninterruptible with Ctrl-C.
Indeed I agree about this.
If these infinite iterators were fixed so they could be interrupted, this containment problem would be much less painful.
I'm unable to reproduce the "uninterruptible with Ctrl-C" problem with infinite iterators. At least itertools doesn't seem to have it:
>>> import itertools
>>> for i in itertools.count():
...     pass
...
^CTraceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyboardInterrupt

>>> for i in itertools.repeat(1):
...     pass
...
^CTraceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyboardInterrupt

>>> for i in itertools.cycle((1,)):
...     pass
...
^CTraceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyboardInterrupt
Same thing on both Windows and Linux, Python 3.6. ––Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven +
On 18 October 2017 at 10:56, Koos Zevenhoven <k7hoven@gmail.com> wrote:
I'm unable to reproduce the "uninterruptible with Ctrl-C" problem with infinite iterators. At least itertools doesn't seem to have it:
>>> import itertools
>>> for i in itertools.count():
...     pass
...
^CTraceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyboardInterrupt
That's not the issue here, as the CPython interpreter implements this with multiple opcodes, and checks between opcodes for Ctrl-C. The demonstration is:
>>> import itertools
>>> 'x' in itertools.count()
... only way to break out is to kill the process. Paul
On 18 October 2017 at 19:56, Koos Zevenhoven <k7hoven@gmail.com> wrote:
I'm unable to reproduce the "uninterruptible with Ctrl-C" problem with infinite iterators. At least itertools doesn't seem to have it:
>>> import itertools
>>> for i in itertools.count():
...     pass
...
That's interrupting the for loop, not the iterator. This is the test case you want for the problem Jason raised:

>>> "a" in itertools.count()

Be prepared to suspend and terminate the affected process, because Ctrl-C isn't going to help :)

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Oct 18, 2017 13:29, "Nick Coghlan" <ncoghlan@gmail.com> wrote: On 18 October 2017 at 19:56, Koos Zevenhoven <k7hoven@gmail.com> wrote:
I'm unable to reproduce the "uninterruptible with Ctrl-C" problem with infinite iterators. At least itertools doesn't seem to have it:
>>> import itertools
>>> for i in itertools.count():
...     pass
...
That's interrupting the for loop, not the iterator. This is the test case you want for the problem Jason raised:

>>> "a" in itertools.count()

Be prepared to suspend and terminate the affected process, because Ctrl-C isn't going to help :)

I'm writing from my phone now, cause I was dumb enough to try list(count())

But should it be fixed in list or in count?

-- Koos

PS. Nick, sorry about the duplicate email.

Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 18 October 2017 at 20:39, Koos Zevenhoven <k7hoven@gmail.com> wrote:
On Oct 18, 2017 13:29, "Nick Coghlan" <ncoghlan@gmail.com> wrote:
On 18 October 2017 at 19:56, Koos Zevenhoven <k7hoven@gmail.com> wrote:
I'm unable to reproduce the "uninterruptible with Ctrl-C" problem with infinite iterators. At least itertools doesn't seem to have it:
>>> import itertools
>>> for i in itertools.count():
...     pass
...
That's interrupting the for loop, not the iterator. This is the test case you want for the problem Jason raised:
>>> "a" in itertools.count()
Be prepared to suspend and terminate the affected process, because Ctrl-C isn't going to help :)
I'm writing from my phone now, cause I was dumb enough to try list(count())
Yeah, that's pretty much the worst case example, since the machine starts thrashing memory long before it actually gives up and starts denying the allocation requests :(
But should it be fixed in list or in count?
That one can only be fixed in count() - list already checks operator.length_hint(), so implementing itertools.count.__length_hint__() to always raise an exception would be enough to handle the container constructor case.

The open question would then be the cases that don't pre-allocate memory, but still always attempt to consume the entire iterator:

    min(itr)
    max(itr)
    sum(itr)
    functools.reduce(op, itr)
    "".join(itr)

And those which *may* attempt to consume the entire iterator, but won't necessarily do so:

    x in itr
    any(itr)
    all(itr)

The items in the first category could likely be updated to check length_hint and propagate any errors immediately, since they don't provide any short circuiting behaviour - feeding them an infinite iterator is a guaranteed uninterruptible infinite loop, so checking for a length hint won't break any currently working code (operator.length_hint defaults to returning zero if a type doesn't implement __length_hint__).

I'm tempted to say the same for the APIs in the latter category as well, but their short-circuiting semantics mean those can technically have well-defined behaviour, even when given an infinite iterator:

>>> any(itertools.count())
True
>>> all(itertools.count())
False
>>> 1 in itertools.count()
True

It's only the "never short-circuits" branch that is ill-defined for non-terminating input. So for these, the safer path would be to emit DeprecationWarning if length_hint fails in 3.7, and then pass the exception through in 3.8+.

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
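For illustration, the C side of such a __length_hint__ could be as small as the sketch below (hypothetical code, not the actual itertools source). One wrinkle: if I'm reading PyObject_LengthHint() correctly, a TypeError raised by __length_hint__ is swallowed and the default hint is used, so the sketch deliberately raises a different exception type:

#include "Python.h"

/* METH_NOARGS: `ignored` is always NULL. */
static PyObject *
count_length_hint(PyObject *self, PyObject *ignored)
{
    /* Not TypeError: PyObject_LengthHint() treats TypeError from
       __length_hint__ as "no hint available" and falls back to the default. */
    PyErr_SetString(PyExc_OverflowError,
                    "cannot compute a length hint for an infinite iterator");
    return NULL;
}

/* Hypothetical entry in count()'s method table. */
static PyMethodDef count_methods_sketch[] = {
    {"__length_hint__", count_length_hint, METH_NOARGS,
     "Always raises, so consumers that pre-size storage fail fast."},
    {NULL, NULL, 0, NULL}
};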
On Wed, Oct 18, 2017 at 2:08 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 18 October 2017 at 20:39, Koos Zevenhoven <k7hoven@gmail.com> wrote:
On Oct 18, 2017 13:29, "Nick Coghlan" <ncoghlan@gmail.com> wrote:
On 18 October 2017 at 19:56, Koos Zevenhoven <k7hoven@gmail.com> wrote:
I'm unable to reproduce the "uninterruptible with Ctrl-C" problem with infinite iterators. At least itertools doesn't seem to have it:
>>> import itertools
>>> for i in itertools.count():
...     pass
...
That's interrupting the for loop, not the iterator. This is the test case you want for the problem Jason raised:
>>> "a" in itertools.count()
Be prepared to suspend and terminate the affected process, because Ctrl-C isn't going to help :)
I'm writing from my phone now, cause I was dumb enough to try list(count())
Yeah, that's pretty much the worst case example, since the machine starts thrashing memory long before it actually gives up and starts denying the allocation requests :(
But should it be fixed in list or in count?
That one can only be fixed in count() - list already checks operator.length_hint(), so implementing itertools.count.__length_hint__() to always raise an exception would be enough to handle the container constructor case.
While that may be a convenient hack to solve some of the cases, maybe it's possible for list(..) etc. to give Ctrl-C a chance every now and then? (Without a noticeable performance penalty, that is.) That would also help with *finite* C-implemented iterables that are just slow to turn into a list. If I'm not mistaken, we're talking about C-implemented functions that iterate over C-implemented iterators. It's not at all obvious to me that it's the iterator that should handle Ctrl-C. ––Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven +
On 18 October 2017 at 22:36, Koos Zevenhoven <k7hoven@gmail.com> wrote:
On Wed, Oct 18, 2017 at 2:08 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
That one can only be fixed in count() - list already checks operator.length_hint(), so implementing itertools.count.__length_hint__() to always raise an exception would be enough to handle the container constructor case.
While that may be a convenient hack to solve some of the cases, maybe it's possible for list(..) etc. to give Ctrl-C a chance every now and then? (Without a noticeable performance penalty, that is.) That would also help with *finite* C-implemented iterables that are just slow to turn into a list.
If I'm not mistaken, we're talking about C-implemented functions that iterate over C-implemented iterators. It's not at all obvious to me that it's the iterator that should handle Ctrl-C.
It isn't, it's the loop's responsibility. The problem is that one of the core design assumptions in the CPython interpreter implementation is that signals from the operating system get handled by the opcode eval loop in the main thread, and Ctrl-C is one of those signals.

This is why "for x in itertools.cycle(): pass" can be interrupted, while "sum(itertools.cycle())" can't: in the latter case, the opcode eval loop isn't running, as we're inside a tight loop inside the sum() implementation.

It's easy to say "Well those loops should all be checking for signals then", but I expect folks wouldn't actually like the consequences of doing something about it, as:

1. It will make those loops slower, due to the extra overhead of checking for signals (even the opcode eval loop includes all sorts of tricks to avoid actually checking for new signals, since doing so is relatively slow)
2. It will make those loops harder to maintain, since the high cost of checking for signals means the existing flat loops will need to be replaced with nested ones to reduce the per-iteration cost of the more expensive checks
3. It means making the signal checking even harder to reason about than it already is, since even C implemented methods that avoid invoking arbitrary Python code could now still end up checking for signals

It's far from being clear to me that making such a change would actually be a net improvement, especially when there's an opportunity to mitigate the problem by having known-infinite iterators report themselves as such.

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Wed, Oct 18, 2017 at 5:48 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 18 October 2017 at 22:36, Koos Zevenhoven <k7hoven@gmail.com> wrote:
On Wed, Oct 18, 2017 at 2:08 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
That one can only be fixed in count() - list already checks operator.length_hint(), so implementing itertools.count.__length_hint__() to always raise an exception would be enough to handle the container constructor case.
While that may be a convenient hack to solve some of the cases, maybe it's possible for list(..) etc. to give Ctrl-C a chance every now and then? (Without a noticeable performance penalty, that is.) That would also help with *finite* C-implemented iterables that are just slow to turn into a list.
If I'm not mistaken, we're talking about C-implemented functions that iterate over C-implemented iterators. It's not at all obvious to me that it's the iterator that should handle Ctrl-C.
It isn't, it's the loop's responsibility. The problem is that one of the core design assumptions in the CPython interpreter implementation is that signals from the operating system get handled by the opcode eval loop in the main thread, and Ctrl-C is one of those signals.
This is why "for x in itertools.cycle(): pass" can be interrupted, while "sum(itertools.cycle())" can't: in the latter case, the opcode eval loop isn't running, as we're inside a tight loop inside the sum() implementation.
It's easy to say "Well those loops should all be checking for signals then", but I expect folks wouldn't actually like the consequences of doing something about it, as:
1. It will make those loops slower, due to the extra overhead of checking for signals (even the opcode eval loop includes all sorts of tricks to avoid actually checking for new signals, since doing so is relatively slow) 2. It will make those loops harder to maintain, since the high cost of checking for signals means the existing flat loops will need to be replaced with nested ones to reduce the per-iteration cost of the more expensive checks
Combining points 1 and 2, I don't believe nesting loops is strictly a requirement.
3. It means making the signal checking even harder to reason about than it already is, since even C implemented methods that avoid invoking arbitrary Python code could now still end up checking for signals
So you're talking about code that would make a C-implemented Python iterable of strictly C-implemented Python objects and then pass this to something C-implemented like list(..) or sum(..), while expecting no Python code to be run or signals to be checked anywhere while doing it. I'm not really convinced that such code exists. But if such code does exist, it sounds like the code is heavily dependent on implementation details.
It's far from being clear to me that making such a change would actually be a net improvement, especially when there's an opportunity to mitigate the problem by having known-infinite iterators report themselves as such.
I'm not against that, per se. I just don't think that solves the quite typical case of *finite* but very tedious or memory-consuming loops that one might want to break out of. And raising an exception from .__length_hint__() might also break some obscure, but completely valid, operations on *infinite* iterables. ––Koos
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
-- + Koos Zevenhoven + http://twitter.com/k7hoven +
On 18 October 2017 at 16:27, Koos Zevenhoven <k7hoven@gmail.com> wrote:
So you're talking about code that would make a C-implemented Python iterable of strictly C-implemented Python objects and then pass this to something C-implemented like list(..) or sum(..), while expecting no Python code to be run or signals to be checked anywhere while doing it. I'm not really convinced that such code exists. But if such code does exist, it sounds like the code is heavily dependent on implementation details.
Well, the OP specifically noted that he had recently encountered precisely that situation: """ I recently came across a bug where checking negative membership (__contains__ returns False) of an infinite iterator will freeze the program. """ Paul
On Wed, Oct 18, 2017 at 6:36 PM, Paul Moore <p.f.moore@gmail.com> wrote:
On 18 October 2017 at 16:27, Koos Zevenhoven <k7hoven@gmail.com> wrote:
So you're talking about code that would make a C-implemented Python iterable of strictly C-implemented Python objects and then pass this to something C-implemented like list(..) or sum(..), while expecting no Python code to be run or signals to be checked anywhere while doing it. I'm not really convinced that such code exists. But if such code does exist, it sounds like the code is heavily dependent on implementation details.
Well, the OP specifically noted that he had recently encountered precisely that situation:
""" I recently came across a bug where checking negative membership (__contains__ returns False) of an infinite iterator will freeze the program. """
No, __contains__ does not expect no python code to be run, because Python code *can* run, as Serhiy in fact already demonstrated for another purpose:

On Wed, Oct 18, 2017 at 3:53 PM, Serhiy Storchaka <storchaka@gmail.com> wrote:
18.10.17 13:22, Nick Coghlan пише:
2. These particular cases can be addressed locally using existing protocols, so the chances of negative side effects are low
Only the particular case `count() in count()` can be addressed without breaking the following examples:
>>> class C:
...     def __init__(self, count):
...         self.count = count
...     def __eq__(self, other):
...         print(self.count, other)
...         if not self.count:
...             return True
...         self.count -= 1
...         return False
...
>>> import itertools
>>> C(5) in itertools.count()
5 0
4 1
3 2
2 3
1 4
0 5
True
Clearly, Python code *does* run from within itertools.count.__contains__(..) ––Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven +
OK, looks like I've lost track of what this thread is about then. Sorry for the noise. Paul On 18 October 2017 at 16:40, Koos Zevenhoven <k7hoven@gmail.com> wrote:
On Wed, Oct 18, 2017 at 6:36 PM, Paul Moore <p.f.moore@gmail.com> wrote:
On 18 October 2017 at 16:27, Koos Zevenhoven <k7hoven@gmail.com> wrote:
So you're talking about code that would make a C-implemented Python iterable of strictly C-implemented Python objects and then pass this to something C-implemented like list(..) or sum(..), while expecting no Python code to be run or signals to be checked anywhere while doing it. I'm not really convinced that such code exists. But if such code does exist, it sounds like the code is heavily dependent on implementation details.
Well, the OP specifically noted that he had recently encountered precisely that situation:
""" I recently came across a bug where checking negative membership (__contains__ returns False) of an infinite iterator will freeze the program. """
No, __contains__ does not expect no python code to be run, because Python code *can* run, as Serhiy in fact already demonstrated for another purpose:
On Wed, Oct 18, 2017 at 3:53 PM, Serhiy Storchaka <storchaka@gmail.com> wrote:
18.10.17 13:22, Nick Coghlan пише:
2. These particular cases can be addressed locally using existing protocols, so the chances of negative side effects are low
Only the particular case `count() in count()` can be addressed without breaking the following examples:
>>> class C:
...     def __init__(self, count):
...         self.count = count
...     def __eq__(self, other):
...         print(self.count, other)
...         if not self.count:
...             return True
...         self.count -= 1
...         return False
...
>>> import itertools
>>> C(5) in itertools.count()
5 0
4 1
3 2
2 3
1 4
0 5
True
Clearly, Python code *does* run from within itertools.count.__contains__(..)
––Koos
-- + Koos Zevenhoven + http://twitter.com/k7hoven +
On Wed, Oct 18, 2017 at 6:56 PM, Paul Moore <p.f.moore@gmail.com> wrote:
OK, looks like I've lost track of what this thread is about then. Sorry for the noise. Paul
No worries. I'm not sure I can tell what this thread is about either. Different people seem to have different ideas about that. My most recent point was that __contains__ already has to allow Python code to run on each iteration, so it is not the kind of code that Nick was referring to, and which I'm not convinced even exists. ––Koos
On Wed, Oct 18, 2017 at 6:36 PM, Paul Moore <p.f.moore@gmail.com> wrote:
On 18 October 2017 at 16:27, Koos Zevenhoven <k7hoven@gmail.com> wrote:
So you're talking about code that would make a C-implemented Python iterable of strictly C-implemented Python objects and then pass this to something C-implemented like list(..) or sum(..), while expecting no Python code to be run or signals to be checked anywhere while doing it. I'm not really convinced that such code exists. But if such code does exist, it sounds like the code is heavily dependent on implementation details.
Well, the OP specifically noted that he had recently encountered precisely that situation:
""" I recently came across a bug where checking negative membership (__contains__ returns False) of an infinite iterator will freeze the program. """
No, __contains__ does not expect no python code to be run, because Python code *can* run, as Serhiy in fact already demonstrated for another purpose:
On 18 October 2017 at 16:40, Koos Zevenhoven <k7hoven@gmail.com> wrote:
On Wed, Oct 18, 2017 at 3:53 PM, Serhiy Storchaka <storchaka@gmail.com> wrote:
18.10.17 13:22, Nick Coghlan пише:
2. These particular cases can be addressed locally using existing protocols, so the chances of negative side effects are low
Only the particular case `count() in count()` can be addressed without breaking the following examples:
>>> class C:
...     def __init__(self, count):
...         self.count = count
...     def __eq__(self, other):
...         print(self.count, other)
...         if not self.count:
...             return True
...         self.count -= 1
...         return False
...
>>> import itertools
>>> C(5) in itertools.count()
5 0
4 1
3 2
2 3
1 4
0 5
True
Clearly, Python code *does* run from within
itertools.count.__contains__(..)
––Koos
-- + Koos Zevenhoven + http://twitter.com/k7hoven +
-- + Koos Zevenhoven + http://twitter.com/k7hoven +
On 2017-10-18 15:48, Nick Coghlan wrote:
On 18 October 2017 at 22:36, Koos Zevenhoven <k7hoven@gmail.com <mailto:k7hoven@gmail.com>> wrote:
On Wed, Oct 18, 2017 at 2:08 PM, Nick Coghlan <ncoghlan@gmail.com <mailto:ncoghlan@gmail.com>> wrote:
That one can only be fixed in count() - list already checks operator.length_hint(), so implementing itertools.count.__length_hint__() to always raise an exception would be enough to handle the container constructor case.
While that may be a convenient hack to solve some of the cases, maybe it's possible for list(..) etc. to give Ctrl-C a chance every now and then? (Without a noticeable performance penalty, that is.) That would also help with *finite* C-implemented iterables that are just slow to turn into a list.
If I'm not mistaken, we're talking about C-implemented functions that iterate over C-implemented iterators. It's not at all obvious to me that it's the iterator that should handle Ctrl-C.
It isn't, it's the loop's responsibility. The problem is that one of the core design assumptions in the CPython interpreter implementation is that signals from the operating system get handled by the opcode eval loop in the main thread, and Ctrl-C is one of those signals.
This is why "for x in itertools.cycle(): pass" can be interrupted, while "sum(itertools.cycle())" can't: in the latter case, the opcode eval loop isn't running, as we're inside a tight loop inside the sum() implementation.
It's easy to say "Well those loops should all be checking for signals then", but I expect folks wouldn't actually like the consequences of doing something about it, as:
1. It will make those loops slower, due to the extra overhead of checking for signals (even the opcode eval loop includes all sorts of tricks to avoid actually checking for new signals, since doing so is relatively slow) 2. It will make those loops harder to maintain, since the high cost of checking for signals means the existing flat loops will need to be replaced with nested ones to reduce the per-iteration cost of the more expensive checks
The re module increments a counter on each iteration and checks for signals when the bottom 12 bits are 0. The regex module increments a 16-bit counter on each iteration and checks for signals when it wraps around to 0.
3. It means making the signal checking even harder to reason about than it already is, since even C implemented methods that avoid invoking arbitrary Python code could now still end up checking for signals
It's far from being clear to me that making such a change would actually be a net improvement, especially when there's an opportunity to mitigate the problem by having known-infinite iterators report themselves as such.
On Wed, Oct 18, 2017 at 9:24 PM, MRAB <python@mrabarnett.plus.com> wrote:
The re module increments a counter on each iteration and checks for signals when the bottom 12 bits are 0.
The regex module increments a 16-bit counter on each iteration and checks for signals when it wraps around to 0.
Then I'd say that's a great solution, except that `regex` probably overestimates the overhead of checking for signals, and that the `re` module for some strange reason wants to do an additional bitwise AND. -- Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven +
18.10.17 17:48, Nick Coghlan пише:
1. It will make those loops slower, due to the extra overhead of checking for signals (even the opcode eval loop includes all sorts of tricks to avoid actually checking for new signals, since doing so is relatively slow) 2. It will make those loops harder to maintain, since the high cost of checking for signals means the existing flat loops will need to be replaced with nested ones to reduce the per-iteration cost of the more expensive checks 3. It means making the signal checking even harder to reason about than it already is, since even C implemented methods that avoid invoking arbitrary Python code could now still end up checking for signals
I have implemented signals checking for itertools iterators. [1] The overhead is insignificant because signals are checked only for every 0x10000-th item (100-4000 times/sec). The consuming loops are not changed because signals are checked on the producer's side. [1] https://bugs.python.org/issue31815
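Conceptually, the producer-side check amounts to something like the sketch below (invented names and a simplified count() object, not the actual patch):

#include "Python.h"

#define CHECK_SIGNALS_EVERY 0x10000   /* same granularity as described above */

/* Simplified stand-in for itertools.count's object layout. */
typedef struct {
    PyObject_HEAD
    Py_ssize_t cnt;
    unsigned long iterations;   /* items produced since the last check */
} countobject_sketch;

/* Sketch of a tp_iternext implementation that checks for signals on the
   producer's side, once every CHECK_SIGNALS_EVERY items. */
static PyObject *
count_next_sketch(countobject_sketch *lz)
{
    if ((++lz->iterations & (CHECK_SIGNALS_EVERY - 1)) == 0) {
        if (PyErr_CheckSignals() < 0)
            return NULL;        /* propagate e.g. KeyboardInterrupt */
    }
    return PyLong_FromSsize_t(lz->cnt++);
}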
On Wed, Oct 18, 2017 at 10:13 PM, Serhiy Storchaka <storchaka@gmail.com> wrote:
18.10.17 17:48, Nick Coghlan пише:
1. It will make those loops slower, due to the extra overhead of checking for signals (even the opcode eval loop includes all sorts of tricks to avoid actually checking for new signals, since doing so is relatively slow) 2. It will make those loops harder to maintain, since the high cost of checking for signals means the existing flat loops will need to be replaced with nested ones to reduce the per-iteration cost of the more expensive checks 3. It means making the signal checking even harder to reason about than it already is, since even C implemented methods that avoid invoking arbitrary Python code could now still end up checking for signals
I have implemented signals checking for itertools iterators. [1] The overhead is insignificant because signals are checked only for every 0x10000-th item (100-4000 times/sec). The consuming loops are not changed because signals are checked on the producer's side.
Nice! Though I'd really like a general solution that other code can easily adopt, even third-party extension libraries. -- Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven +
18.10.17 22:21, Koos Zevenhoven пише:
Nice! Though I'd really like a general solution that other code can easily adopt, even third-party extension libraries.
What is the more general solution? For interrupting C code you need to check signals manually, either in every loop, or in every iterator. It seems to me that the number of loops is larger than the number of iterators.
On Wed, Oct 18, 2017 at 10:30 PM, Serhiy Storchaka <storchaka@gmail.com> wrote:
18.10.17 22:21, Koos Zevenhoven пише:
Nice! Though I'd really like a general solution that other code can easily adopt, even third-party extension libraries.
What is the more general solution? For interrupting C code you need to check signals manually, either in every loop, or in every iterator. It seems to me that the number of loops is larger than the number of iterators.
Sorry, I missed this email earlier. Maybe a macro like Py_MAKE_THIS_LOOP_BREAKABLE_FOR_ME_PLEASE that you could insert wherever you think the code might be spending some time without calling any Python code. One could use it rather carelessly, at least more so than refcounts. Something like the macro you wrote, except that it would take care of the whole thing and not just the counting. -- Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven +
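Something along these lines, perhaps (purely hypothetical, not an existing CPython API; the surrounding loop in the usage sketch is made up):

#include "Python.h"

/* Hypothetical convenience macro: a cheap counter bump that only rarely
   calls into PyErr_CheckSignals().  Evaluates to 0 normally, and to -1 if
   a signal handler raised an exception. */
#define Py_LOOP_SIGNAL_CHECK(counter, every)                         \
    ((++(counter) < (every)) ? 0                                     \
                             : ((counter) = 0, PyErr_CheckSignals()))

/* Usage sketch inside some long-running C loop: */
static int
long_running_work(void)
{
    unsigned int check_counter = 0;
    int more_work_to_do = 1;

    while (more_work_to_do) {
        /* ... one unit of work, which eventually clears more_work_to_do ... */

        if (Py_LOOP_SIGNAL_CHECK(check_counter, 4096) < 0)
            return -1;          /* e.g. KeyboardInterrupt was raised */
    }
    return 0;
}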
On Wed, Oct 18, 2017 at 10:21 PM, Koos Zevenhoven <k7hoven@gmail.com> wrote:
On Wed, Oct 18, 2017 at 10:13 PM, Serhiy Storchaka <storchaka@gmail.com> wrote:
18.10.17 17:48, Nick Coghlan пише:
1. It will make those loops slower, due to the extra overhead of checking for signals (even the opcode eval loop includes all sorts of tricks to avoid actually checking for new signals, since doing so is relatively slow) 2. It will make those loops harder to maintain, since the high cost of checking for signals means the existing flat loops will need to be replaced with nested ones to reduce the per-iteration cost of the more expensive checks 3. It means making the signal checking even harder to reason about than it already is, since even C implemented methods that avoid invoking arbitrary Python code could now still end up checking for signals
I have implemented signals checking for itertools iterators. [1] The overhead is insignificant because signals are checked only for every 0x10000-th item (100-4000 times/sec). The consuming loops are not changed because signals are checked on the producer's side.
Nice! Though I'd really like a general solution that other code can easily adopt, even third-party extension libraries.
By the way, now that I actually read the BPO issue, it looks like the benchmarks were for 0x1000 (15 bits)? And why is everyone doing powers of two anyway? Anyway, I still don't think infinite iterables are the most common case where this problem occurs. Solving this in the most common consuming loops would allow breaking out of a lot of long loops regardless of which iterable type (if any) is being used. So I'm still asking which one should solve the problem. -- Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven +
On Wed, Oct 18, 2017 at 01:39:28PM +0300, Koos Zevenhoven wrote:
I'm writing from my phone now, cause I was dumb enough to try list(count())
You have my sympathies -- I once, due to a typo, accidentally ran something like range(10**100) in Python 2.
But should it be fixed in list or in count?
Neither. There are too many other places this can break for it to be effective to try to fix each one in place. e.g. set(xrange(2**64)), or tuple(itertools.repeat([1]))

Rather, I think we should set a memory limit that applies to the whole process. Once you try to allocate more memory, you get a MemoryError exception rather than have the OS thrash forever trying to allocate a terabyte on a 4GB machine. (I don't actually understand why the OS can't fix this.)

Being able to limit memory use is a fairly common request, e.g. on Stackoverflow:

https://stackoverflow.com/questions/30269238/limit-memory-usage
https://stackoverflow.com/questions/2308091/how-to-limit-the-heap-size
https://community.webfaction.com/questions/15367/setting-max-memory-for-pyth...

And Java apparently has a commandline switch to manage memory:

https://stackoverflow.com/questions/22887400/is-there-an-equivalent-to-java-...

The problems with the resource module are that it's effectively an interface to ulimit, which makes it confusing and platform-specific; it is no help to Windows users; it isn't used by default; and not many people know about it. (I know I didn't until tonight.)

So here is my suggestion:

1. Let's add a function in sys to set the "maximum memory" available, for some definition of memory that makes the most sense on your platform. Ordinary Python programmers shouldn't have to try to decipher the ulimit interface.

2. Have a command line switch to set that value, e.g.:

    python3 -L 1073741824  # 1 GiB
    python3 -L 0           # machine-dependent limit
    python3 -L -1          # unlimited

where the machine-dependent limit is set by the interpreter, depending on the amount of memory it thinks it has available.

3. For the moment, stick to defaulting to -L -1 "unlimited", but with the intention to change to -L 0 "let the interpreter decide" in some future release, after an appropriate transition period.

On Linux, we can always run

    ulimit XXXX python3

but honestly I never know which option to give (maximum stack size? maximum virtual memory? why is there no setting for maximum real memory?) and that doesn't help Windows users.

Thoughts?

-- Steve
On 18 October 2017 at 21:38, Steven D'Aprano <steve@pearwood.info> wrote:
But should it be fixed in list or in count?
Neither. There are too many other places this can break for it to be effective to try to fix each one in place.
e.g. set(xrange(2**64)), or tuple(itertools.repeat([1]))
A great many of these call operator.length_hint() these days in order to make a better guess as to how much memory to pre-allocate, so while that still wouldn't intercept everything, it would catch a lot of them.
Rather, I think we should set a memory limit that applies to the whole process. Once you try to allocate more memory, you get an MemoryError exception rather than have the OS thrash forever trying to allocate a terabyte on a 4GB machine.
(I don't actually understand why the OS can't fix this.)
Trying to allocate enormous amounts of memory all at once isn't the problem, as that just fails outright with "Not enough memory":

>>> data = bytes(2**62)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
MemoryError

The machine-killing case is repeated allocation requests that the operating system *can* satisfy, but require paging almost everything else out of RAM. And that's exactly what "list(infinite_iterator)" entails, since the interpreter will make an initial guess as to the correct size, and then keep resizing the allocation to 125% of its previous size each time it fills up (or so - I didn't check the current overallocation factor).

Per-process memory quotas *can* help avoid this, but enforcing them requires that every process run in a resource controlled sandbox. Hence, it's not a coincidence that mobile operating systems and container-based server environments already work that way, and the improved ability to cope with misbehaving applications is part of why desktop operating systems would like to follow the lead of their mobile and server counterparts :)

So here is my suggestion:
1. Let's add a function in sys to set the "maximum memory" available, for some definition of memory that makes the most sense on your platform. Ordinary Python programmers shouldn't have to try to decipher the ulimit interface.
Historically, one key reason we didn't do that was because the `PyMem_*` APIs bypassed CPython's memory allocator, so such a limit wouldn't have been particularly effective. As of 3.6 though, even bulk memory allocations pass through pymalloc, making a Python level memory allocation limit potentially more viable (since it would pick up almost all of the interpeter's own allocations, even if it missed those in extension modules): https://docs.python.org/dev/whatsnew/3.6.html#optimizations Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
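For illustration, here's roughly what hooking a cap in at the allocator level could look like (purely a sketch under big assumptions: it only hooks the "raw" domain, it only rejects individual requests larger than the cap rather than doing real process-wide accounting, and all the names are made up):

#include "Python.h"

/* Hypothetical sketch: refuse any single raw allocation above a cap.
   Proper process-wide accounting would need to track the size of every
   live allocation, which this deliberately does not attempt. */

static size_t max_single_alloc = (size_t)1 << 31;   /* e.g. 2 GiB */
static PyMemAllocatorEx prev_raw;

static void *
capped_malloc(void *ctx, size_t size)
{
    if (size > max_single_alloc)
        return NULL;                    /* caller turns this into MemoryError */
    return prev_raw.malloc(prev_raw.ctx, size);
}

static void *
capped_calloc(void *ctx, size_t nelem, size_t elsize)
{
    if (nelem != 0 && elsize > max_single_alloc / nelem)
        return NULL;
    return prev_raw.calloc(prev_raw.ctx, nelem, elsize);
}

static void *
capped_realloc(void *ctx, void *ptr, size_t new_size)
{
    if (new_size > max_single_alloc)
        return NULL;
    return prev_raw.realloc(prev_raw.ctx, ptr, new_size);
}

static void
capped_free(void *ctx, void *ptr)
{
    prev_raw.free(prev_raw.ctx, ptr);
}

/* Call this early, ideally before Py_Initialize(). */
void
install_alloc_cap(void)
{
    PyMemAllocatorEx hook = {NULL, capped_malloc, capped_calloc,
                             capped_realloc, capped_free};
    PyMem_GetAllocator(PYMEM_DOMAIN_RAW, &prev_raw);
    PyMem_SetAllocator(PYMEM_DOMAIN_RAW, &hook);
}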
On Wed, Oct 18, 2017 at 10:43:57PM +1000, Nick Coghlan wrote:
Per-process memory quotas *can* help avoid this, but enforcing them requires that every process run in a resource controlled sandbox. Hence, it's not a coincidence that mobile operating systems and container-based server environments already work that way, and the improved ability to cope with misbehaving applications is part of why desktop operating systems would like to follow the lead of their mobile and server counterparts :)
Does this also fall under the sandbox definition? $ softlimit -m 1000000000 python3 Python 3.7.0a1+ (heads/master:bdaeb7d237, Oct 16 2017, 18:54:55) [GCC 4.8.5] on linux Type "help", "copyright", "credits" or "license" for more information.
[0] * 10000000000000 Traceback (most recent call last): File "<stdin>", line 1, in <module> MemoryError
People who are worried could make a python3 alias or use Ctrl-\. Stefan Krah
On 18 October 2017 at 22:51, Stefan Krah <stefan@bytereef.org> wrote:
On Wed, Oct 18, 2017 at 10:43:57PM +1000, Nick Coghlan wrote:
Per-process memory quotas *can* help avoid this, but enforcing them requires that every process run in a resource controlled sandbox. Hence, it's not a coincidence that mobile operating systems and container-based server environments already work that way, and the improved ability to cope with misbehaving applications is part of why desktop operating systems would like to follow the lead of their mobile and server counterparts :)
Does this also fall under the sandbox definition?
$ softlimit -m 1000000000 python3
Yeah, Linux offers good opt-in tools for this kind of thing, and the combination of Android and containerised server environments means they're only getting better. But we're still some time away from it being routine for your desktop to be well protected from memory management misbehaviour in arbitrary console or GUI applications.

The resource module (which Steven mentioned in passing) already provides opt-in access to some of those features from within the program itself: https://docs.python.org/3/library/resource.html

For example:

>>> import sys, resource
>>> data = bytes(2**32)
>>> resource.setrlimit(resource.RLIMIT_DATA, (2**31, sys.maxsize))
>>> data = bytes(2**32)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
MemoryError
>>> resource.setrlimit(resource.RLIMIT_DATA, (sys.maxsize, sys.maxsize))
>>> data = bytes(2**32)

(Bulk memory allocations start failing on my machine somewhere between 2**33 and 2**34, which is about what I'd expect, since it has 8 GiB of physical RAM installed)

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Wed, Oct 18, 2017 at 02:51:37PM +0200, Stefan Krah wrote:
$ softlimit -m 1000000000 python3 [...] MemoryError
People who are worried could make a python3 alias or use Ctrl-\.
I just tried that on two different Linux computers I have, and neither have softlimit. Nor (presumably) would this help Windows users. -- Steve
Hi Steve, 2017-10-19 1:59 GMT+02:00 Steven D'Aprano <steve@pearwood.info>:
On Wed, Oct 18, 2017 at 02:51:37PM +0200, Stefan Krah wrote:
$ softlimit -m 1000000000 python3 [...] MemoryError
People who are worried could make a python3 alias or use Ctrl-\.
I just tried that on two different Linux computers I have, and neither have softlimit.
Yeah, not sure what "softlimit" is either. I'd suggest sticking to POSIX-standard ulimit or just stick something like this in the .pythonrc.py:

import resource
resource.setrlimit(resource.RLIMIT_DATA, (2 * 1024**3, 2 * 1024**3))

Nor (presumably) would this help Windows users.
I (quickly) tried to get something to work using the win32 package, in particular the win32job functions. However, it seems setting "ProcessMemoryLimit" using win32job.SetInformationJobObject had no effect (i.e. a subsequent win32job.QueryInformationJobObject still showed the limit as 0)? People with stronger Windows-fu may be aware what is going on here... Stephan
-- Steve
On Thu, Oct 19, 2017 at 10:05:58AM +0200, Stephan Houben wrote:
2017-10-19 1:59 GMT+02:00 Steven D'Aprano <steve@pearwood.info>:
On Wed, Oct 18, 2017 at 02:51:37PM +0200, Stefan Krah wrote:
$ softlimit -m 1000000000 python3 [...] MemoryError
People who are worried could make a python3 alias or use Ctrl-\.
I just tried that on two different Linux computers I have, and neither have softlimit.
Yeah, not sure what "softlimit" is either.
Part of daemontools: https://cr.yp.to/daemontools/softlimit.html Stefan Krah
On Thu, Oct 19, 2017 at 10:05 AM, Stephan Houben <stephanh42@gmail.com> wrote:
Hi Steve,
2017-10-19 1:59 GMT+02:00 Steven D'Aprano <steve@pearwood.info>:
On Wed, Oct 18, 2017 at 02:51:37PM +0200, Stefan Krah wrote:
$ softlimit -m 1000000000 python3 [...] MemoryError
People who are worried could make a python3 alias or use Ctrl-\.
I just tried that on two different Linux computers I have, and neither have softlimit.
Yeah, not sure what "softlimit" is either. I'd suggest sticking to POSIX-standard ulimit or just stick something like this in the .pythonrc.py:
import resource
resource.setrlimit(resource.RLIMIT_DATA, (2 * 1024**3, 2 * 1024**3))
Nor (presumably) would this help Windows users.
I (quickly) tried to get something to work using the win32 package, in particular the win32job functions. However, it seems setting "ProcessMemoryLimit" using win32job.SetInformationJobObject had no effect (i.e. a subsequent win32job.QueryInformationJobObject still showed the limit as 0)?
People with stronger Windows-fu may be aware what is going on here...
Stephan
I wasn't aware Windows was capable of setting such limits in a per-process fashion. You gave me a good idea for psutil: https://github.com/giampaolo/psutil/issues/1149

According to this cmdline tool: https://cr.yp.to/daemontools/softlimit.html

...the limit should kick in only when the system memory is full, whatever that means:

<<-r n: Limit the resident set size to n bytes. This limit is not enforced unless physical memory is full.>>

...so that would explain why it had no effect.

-- Giampaolo - http://grodola.blogspot.com
On Thu, Oct 19, 2017 at 9:05 AM, Stephan Houben <stephanh42@gmail.com> wrote:
I (quickly) tried to get something to work using the win32 package, in particular the win32job functions. However, it seems setting "ProcessMemoryLimit" using win32job.SetInformationJobObject had no effect (i.e. a subsequent win32job.QueryInformationJobObject still showed the limit as 0)?
Probably you didn't set the JOB_OBJECT_LIMIT_PROCESS_MEMORY flag. Here's an example that tests the process memory limit using ctypes to call VirtualAlloc, before and after assigning the current process to the Job.

Note that the py.exe launcher runs python.exe in an anonymous Job that's configured to kill on close (i.e. python.exe is killed when py.exe exits) and for silent breakaway of child processes. In this case, prior to Windows 8 (the first version to support nested Job objects), assigning the current process to a new Job will fail, so you'll have to run python.exe directly, or use a child process via subprocess. I prefer the former, since a child process won't be tethered to the launcher, which could get ugly for console applications.

import ctypes
import winerror, win32api, win32job

kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)

MEM_COMMIT = 0x1000
MEM_RELEASE = 0x8000
PAGE_READWRITE = 4

kernel32.VirtualAlloc.restype = ctypes.c_void_p
kernel32.VirtualAlloc.argtypes = (ctypes.c_void_p, ctypes.c_size_t,
                                  ctypes.c_ulong, ctypes.c_ulong)
kernel32.VirtualFree.argtypes = (ctypes.c_void_p, ctypes.c_size_t,
                                 ctypes.c_ulong)

hjob = win32job.CreateJobObject(None, "")
limits = win32job.QueryInformationJobObject(
    hjob, win32job.JobObjectExtendedLimitInformation)
limits['BasicLimitInformation']['LimitFlags'] |= (
    win32job.JOB_OBJECT_LIMIT_PROCESS_MEMORY)
limits['ProcessMemoryLimit'] = 2**31
win32job.SetInformationJobObject(
    hjob, win32job.JobObjectExtendedLimitInformation, limits)

addr0 = kernel32.VirtualAlloc(None, 2**31, MEM_COMMIT, PAGE_READWRITE)
if addr0:
    mem0_released = kernel32.VirtualFree(addr0, 0, MEM_RELEASE)

win32job.AssignProcessToJobObject(hjob, win32api.GetCurrentProcess())

addr1 = kernel32.VirtualAlloc(None, 2**31, MEM_COMMIT, PAGE_READWRITE)

Result:

>>> addr0
2508252315648
>>> mem0_released
1
>>> addr1 is None
True
>>> ctypes.get_last_error() == winerror.ERROR_COMMITMENT_LIMIT
True
On Wed, Oct 18, 2017 at 2:38 PM, Steven D'Aprano <steve@pearwood.info> wrote:
On Wed, Oct 18, 2017 at 01:39:28PM +0300, Koos Zevenhoven wrote:
I'm writing from my phone now, cause I was dumb enough to try list(count())
You have my sympathies -- I once, due to a typo, accidentally ran something like range(10**100) in Python 2.
Oh, I think I've done something like that too, and there are definitely still opportunities in Python 3 to ask for the impossible. But what I did now, I did "on purpose". For a split second, I really wanted to know how bad it would be. But a few minutes later I had little interest left in that ;). Rebooting a computer definitely takes longer than restarting a Python process.
But should it be fixed in list or in count?
Neither. There are too many other places this can break for it to be effective to try to fix each one in place.
To clarify, I was talking about allowing Ctrl-C to break it, which somebody had suggested. That would also help if the C-implemented iterable just takes a lot of time to generate the items. And for the record, I just tried
sum(itertools.count())
And as we could expect, it does not respect Ctrl-C either. ––Koos -- + Koos Zevenhoven + http://twitter.com/k7hoven +
Hi all,

FWIW, I just tried the list(count()) experiment on my phone (Termux Python interpreter under Android).

Python 3.6.2 (default, Sep 16 2017, 23:55:07)
[GCC 4.2.1 Compatible Android Clang 5.0.300080 ] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import itertools
>>> list(itertools.count())
Killed
Interestingly, even the Termux app stays alive and otherwise the phone remains responsive and doesn't get hot. I am now sending this mail from that very phone. So this issue is not an issue on the world's most popular OS 😁

Stephan

On 18 Oct 2017 08:46, "Brendan Barnwell" <brenbarn@brenbarn.net> wrote:
On 2017-10-17 07:26, Serhiy Storchaka wrote:
On 17.10.17 17:06, Nick Coghlan wrote:
Keep in mind we're not talking about a regular loop you can break out of with Ctrl-C here - we're talking about a tight loop inside the interpreter internals that leads to having to kill the whole host process just to get out of it.
And this is the root of the issue. Just make more of these tight loops interruptible with Ctrl-C, and that will fix the more general issue.
I was just thinking the same thing. I think in general it's always bad for code to be uninterruptible with Ctrl-C. If these infinite iterators were fixed so they could be interrupted, this containment problem would be much less painful.
-- Brendan Barnwell
"Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
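For code that wants to protect itself today, rather than waiting for the interpreter internals to change, one workaround is to keep the consuming loop at the Python level, where Ctrl-C is still delivered between iterations. A rough sketch (the bounded helper and its default limit are made up for illustration):

import itertools

def bounded(iterable, limit=10**7):
    # Hypothetical guard: yield at most `limit` items, then raise instead of
    # silently consuming an infinite iterator. Because the loop runs in
    # Python bytecode, KeyboardInterrupt is honoured between iterations.
    for count, item in enumerate(iterable):
        if count >= limit:
            raise RuntimeError("iterator produced more than %d items" % limit)
        yield item

# list(bounded(itertools.count())) raises RuntimeError after 10**7 items
# instead of filling memory, and can be interrupted with Ctrl-C at any point.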
participants (17)
- Antoine Pitrou
- Brendan Barnwell
- eryk sun
- Giampaolo Rodola'
- Greg Ewing
- Jason Campbell
- Koos Zevenhoven
- Michel Desmoulin
- MRAB
- Nick Coghlan
- Paul Moore
- Serhiy Storchaka
- Stefan Krah
- Stephan Houben
- Steven D'Aprano
- Terry Reedy
- אלעזר