Proposal: allow length_hint to specify infinite iterators

PEP 424 allows iterators to optionally offer a hint as to how long they will be:

https://www.python.org/dev/peps/pep-0424/

Unfortunately, there's no good way for an iterator to report that it is infinitely long. Consequently, even those which are known to be infinite report finite lengths:

    py> from itertools import count
    py> from operator import length_hint
    py> infinite = count()
    py> length_hint(infinite)
    0

This wastes the opportunity to fail fast on operations which cannot possibly succeed, e.g. list(count()) must eventually fail with MemoryError. Or worse: if the OS starts thrashing trying to meet the memory requests, you can lock up the computer.

I propose that we:

(1) extend the __length_hint__ protocol to allow iterators to report that they are infinite;

(2) and recommend that consumers of iterators (such as list) that require finite input should fail fast in the event of an infinite iterator.

Four possible ways that __length_hint__ and operator.length_hint might signal an infinite iterator:

(a) return a negative value such as -1 (this is currently an error);

(b) return some special sentinel value;

(c) take the convention that returning sys.maxint means infinity;

(d) raise an exception.

The advantage of (d) is that consumers that check __length_hint__ don't need to do anything special to fail fast on infinite iterators:

    py> class Thing:
    ...     def __length_hint__(self):
    ...         raise ValueError('infinite')
    ...     def __iter__(self):
    ...         return count()
    ...
    py> x = Thing()
    py> list(x)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 3, in __length_hint__
    ValueError: infinite

but if they can cope with such, they can explicitly catch the exception.

Thoughts?

-- 
Steve
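For illustration, here is a rough sketch of how a consumer might use option (d). The names Naturals, is_known_infinite and first_items are hypothetical and not part of any proposal text. The use of ValueError as the signal is an assumption taken from the Thing example. The sketch relies only on the existing behaviour that operator.length_hint swallows TypeError but lets other exceptions propagate.

```python
from itertools import count, islice
from operator import length_hint


class Naturals:
    """Hypothetical infinite iterable that signals infinity via option (d)."""

    def __iter__(self):
        return count()

    def __length_hint__(self):
        # Assumed convention: raising ValueError means "known to be infinite".
        raise ValueError("infinite")


def is_known_infinite(iterable):
    """Return True if the iterable declares itself infinite (hypothetical helper)."""
    try:
        length_hint(iterable)
    except ValueError:
        return True
    return False


def first_items(iterable, n):
    """A consumer that can cope with infinite input: it only takes n items."""
    return list(islice(iterable, n))
```

A consumer that needs finite input could call is_known_infinite() first and refuse to proceed, while something like first_items(Naturals(), 3) keeps working because it never asks for the whole sequence.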

29.11.17 03:34, Steven D'Aprano wrote:
> This wastes the opportunity to fail fast on operations which cannot possibly succeed, e.g. list(count()) must eventually fail with MemoryError. Or worse: if the OS starts thrashing trying to meet the memory requests, you can lock up the computer.
> I propose that we:
> (1) extend the __length_hint__ protocol to allow iterators to report that they are infinite;
> (2) and recommend that consumers of iterators (such as list) that require finite input should fail fast in the event of an infinite iterator.

Infinite iterators are rare. And count() is even more special in the sense that iterating it doesn't have side effects and is not interruptible.

> (c) take the convention that returning sys.maxint means infinity;

Returning sys.maxsize will likely lead to failing fast with MemoryError.
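A small experiment along these lines, as a sketch only: HugeHint and try_build_list are hypothetical names, and the underlying data is deliberately finite so the script terminates even if an implementation ignores the hint. On CPython the expectation is that list() tries to preallocate according to the hint and gives up at once with MemoryError, but other implementations or versions may behave differently.

```python
import sys


class HugeHint:
    """Hypothetical iterable that advertises sys.maxsize as its length hint."""

    def __iter__(self):
        # Deliberately finite, so the experiment ends even if the hint is ignored.
        return iter(range(3))

    def __length_hint__(self):
        return sys.maxsize


def try_build_list(iterable):
    """Return the built list, or the string 'MemoryError' if preallocation fails."""
    try:
        return list(iterable)
    except MemoryError:
        return "MemoryError"


result = try_build_list(HugeHint())
print(result)
```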

On Wed, 29 Nov 2017 09:39:57 +0200 Serhiy Storchaka <storchaka@gmail.com> wrote:
> 29.11.17 03:34, Steven D'Aprano wrote:
>> This wastes the opportunity to fail fast on operations which cannot possibly succeed, e.g. list(count()) must eventually fail with MemoryError. Or worse: if the OS starts thrashing trying to meet the memory requests, you can lock up the computer.
>> I propose that we:
>> (1) extend the __length_hint__ protocol to allow iterators to report that they are infinite;
>> (2) and recommend that consumers of iterators (such as list) that require finite input should fail fast in the event of an infinite iterator.
> Infinite iterators are rare. And count() is even more special in the sense that iterating it doesn't have side effects and is not interruptible.

Not to mention that many infinite iterators cannot be predicted in advance to be infinite. Only the more trivial cases such as count() would apply (but not a takewhile() involving count(), for example).

Regards

Antoine.
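To make the takewhile() point concrete, here is a minimal sketch. The variable names are illustrative only. Whether the resulting iterator terminates depends entirely on the predicate, which cannot be inspected in advance, and takewhile objects offer no length hint at all, so operator.length_hint falls back to its default of 0 in both cases.

```python
from itertools import count, takewhile
from operator import length_hint

# Terminates: the predicate eventually becomes false.
finite = takewhile(lambda n: n < 5, count())

# Never terminates if consumed: the predicate is always true.
endless = takewhile(lambda n: True, count())

# Neither object gives any hint, so the two cases look identical up front.
finite_hint = length_hint(finite)
endless_hint = length_hint(endless)

# Only the first is safe to materialise.
finite_items = list(finite)
```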
Participants (3): Antoine Pitrou, Serhiy Storchaka, Steven D'Aprano