Make start, stop, step properties on islice
Is there a reason that itertools.islice does not provide its start, stop and step values as attributes, similar to range? This seems like a sensible and useful thing to have, and would also allow islice's to have a __len__. Mathew
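For concreteness, here is a minimal sketch of what the requested attributes might look like. `InspectableISlice` is a hypothetical wrapper invented for this illustration, not anything in itertools; it only assumes that `slice()` parses its positional arguments the same way `islice` does, and it normalizes a missing start to 0 and a missing step to 1 as an arbitrary choice.

```python
from itertools import islice


class InspectableISlice:
    """Hypothetical islice wrapper that remembers its slice parameters."""

    def __init__(self, iterable, *args):
        # slice() accepts (stop) or (start, stop[, step]), like islice.
        s = slice(*args)
        self.start = 0 if s.start is None else s.start
        self.stop = s.stop  # None means "no upper bound"
        self.step = 1 if s.step is None else s.step
        self._it = islice(iterable, s.start, s.stop, s.step)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._it)


# range already exposes these three attributes; the wrapper mirrors them.
r = range(1, 6, 2)
view = InspectableISlice("abcdefgh", 1, 6, 2)
```

Here `view.start`, `view.stop` and `view.step` match the corresponding attributes of `r`, while iterating `view` still behaves like a plain `islice`.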
The start/stop/step sound like they might be nice. But that wouldn't give you a length, since you never know when an iterator will be exhausted. I feel like `len(islice(it, 1, 1_000_000))` telling you the "maximum possible length" is more a danger than a help.
On Wed, Aug 12, 2020 at 4:41 AM Mathew Elman <mathew.elman@ocado.com> wrote:
Is there a reason that itertools.islice does not provide its start, stop and step values as attributes, similar to range? This seems like a sensible and useful thing to have, and would also allow islice's to have a __len__.
Mathew
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-leave@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/UQD5GK...
Code of Conduct: http://python.org/psf/codeofconduct/
You're absolutely right. I realized after posting that __len__ would be the maximum possible length, and it would likely be more dangerous than helpful.
On Wed, Aug 12, 2020 at 8:49 AM David Mertz <mertz@gnosis.cx> wrote:
The start/stop/step sound like they might be nice. But that wouldn't give you a length, since you never know when an iterator will be exhausted. I feel like `len(islice(it, 1, 1_000_000))` telling you the "maximum possible length" is more a danger than a help.
I think islice should implement __length_hint__, though. As of 3.8.5 it doesn't.
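As a rough sketch of what such a hint could compute: the helper below is hypothetical (its name and signature are made up for illustration) and relies only on `operator.length_hint` from the standard library. It treats `sys.maxsize` as a stand-in for "the source offers no hint", which is an assumption of this sketch, and the result is an estimate, not a guaranteed length.

```python
import operator
import sys


def islice_length_hint(source, start=0, stop=None, step=1):
    """Hypothetical helper: estimate how many items
    islice(source, start, stop, step) would yield.

    Returns None when neither the source nor `stop` bounds the result.
    """
    # sys.maxsize is used here as a sentinel for "no hint available".
    available = operator.length_hint(source, sys.maxsize)
    bound = available if stop is None else min(stop, available)
    if bound == sys.maxsize:
        return None
    # The slice visits indices start, start + step, ... below `bound`.
    return len(range(start, bound, step))
```

For a list iterator the estimate happens to be exact, since list iterators report their remaining length. For a generator with a finite `stop` it is only an upper bound, which is why it fits `__length_hint__` better than `__len__`.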
On Thu, Aug 13, 2020 at 12:27 PM Ben Rudiak-Gould <benrudiak@gmail.com> wrote:
I think islice should implement __length_hint__, though. As of 3.8.5 it doesn't.
And it could support __len__, and raise an Exception when the underlying iterable doesn’t support it.

I know that itertools needs to support arbitrary iterables, but I do wish it provided more Sequence features when it could. islice, for instance, could support negative indexes when it is wrapping a Sequence.

-CHB

--
Christopher Barker, PhD
Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
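One way to picture the negative-index idea is sketched below. `seq_islice` is a hypothetical name, not an itertools function; the sketch assumes the argument is a `collections.abc.Sequence` and leans on `slice.indices()` to resolve negative values against the known length.

```python
from collections.abc import Sequence


def seq_islice(seq, start=None, stop=None, step=None):
    """Hypothetical lazy slice over a Sequence that accepts negative indexes."""
    if not isinstance(seq, Sequence):
        raise TypeError("seq_islice() needs a Sequence to resolve negative indexes")
    # slice.indices() turns negative or missing values into concrete bounds.
    for i in range(*slice(start, stop, step).indices(len(seq))):
        yield seq[i]
```

For example, `list(seq_islice("abcdef", -3, None))` gives the last three characters without copying the string first. Because this is a generator function, the TypeError for a non-Sequence only surfaces once iteration starts.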
On Thu, Aug 13, 2020 at 12:38:38PM -0700, Christopher Barker wrote:
On Thu, Aug 13, 2020 at 12:27 PM Ben Rudiak-Gould <benrudiak@gmail.com> wrote:
I think islice should implement __length_hint__, though. As of 3.8.5 it doesn't.
And it could support __len__, and raise an Exception when the underlying iterable doesn’t support it.
"Iterators should support len" is one of those features that everyone thinks they want, but nobody can show how it is actually workable as a language or library feature in the most general case. It might, sometimes, be workable for a specific custom iterator that you control yourself, in an application. But as a library feature, it is unusable.

The problem is that the concept of "the length of an iterator" is unworkable in general, leading to confusing and contradictory behaviour. This seems reasonable at first:

    orig = 'abcd'
    it = iter(orig)
    assert len(it) == 4

but as soon as you begin to iterate over the iterator, we run into trouble. Should `len(it)` return the length of the original sequence, or track the number of currently remaining elements? Both cases are troublesome.

(1) We return the length of the original sequence. Then we have the surprising result that `len(it) != len(list(it))`.

    item = next(it)
    assert item == 'a'
    assert len(it) == len(orig)
    list(it)  # ['b', 'c', 'd'] has length 3, not 4

This violates the critical invariant that if an iterable has length N, then iterating over it (with no early exit) will take N loops.

    count = 0
    n = len(it)
    for x in it:
        count += 1
    assert count == n

If the assertion fails, then your len function lied to you, and we will have a lot of bug reports that len is inaccurate.

(2) We track the remaining items in the iterator. Then we violate the critical invariant that the length of a sequence (or sequence-like object) should not depend on whether you have iterated over it or not.

    it = iter(orig)
    assert len(it) == 4
    for x in it:
        pass
    assert len(it) == 0

Why is this a critical invariant? Because otherwise we introduce a surprising temporal coupling in your code, making algorithms fragile and likely buggy.
The length of the iterator depends on whether we check it before or after the loop, which is bad:

    def average(iterable):
        return sum(iterable)/len(iterable)

    def average(iterable):
        n = len(iterable)
        return sum(iterable)/n

If those two functions don't give the same result, then your length calculation is broken.

Whichever strategy we pick for the length of an iterator, we're going to surprise people and end up with fragile, buggy code. The easy cases:

    it = iter(sequence)
    n = len(it)
    for item in it:
        process(item)
    print(f"Processed {n} items")

where we work with a fresh iterator, retrieve the length *before* iterating, and then *only* iterate fully to completion, might work okay, but as soon as you get to more complex cases the idea of len for iterators is a minefield.

The only generally correct solution is to not pick either strategy (1) or strategy (2), both of which are sometimes what the caller expects but sometimes lead to surprising results and fragile, broken code, but instead refuse to guess.

--
Steven
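To illustrate the "refuse to guess" position with code that runs today, the sketch below shows that plain iterators raise TypeError from `len()`, and how an `average` can sidestep the question by counting during a single pass. `has_len` and the empty-input ValueError are choices made for this illustration, not something proposed in the thread.

```python
def has_len(obj):
    """Return True if len(obj) works, False if the object refuses to guess."""
    try:
        len(obj)
    except TypeError:
        return False
    return True


def average(iterable):
    """Mean of any iterable, computed in one pass without calling len()."""
    total = 0
    count = 0
    for value in iterable:
        total += value
        count += 1
    if count == 0:
        raise ValueError("average() of an empty iterable")
    return total / count
```

Since the count is taken during the same pass as the sum, the result does not depend on whether the argument is a list, a fresh iterator, or a partially consumed one.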
On 12.08.20 10:37, Mathew Elman wrote:
Is there a reason that itertools.islice does not provide its start, stop and step values as attributes, similar to range? This seems like a sensible and useful thing to have, and would also allow islice's to have a __len__.
Not all iterators need to be finite. E.g. `it.islice(it.count(), 5, None)` (with `import itertools as it`) skips the first five elements but is still infinite, so it doesn't have a length.
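A short runnable version of that example, assuming `it` is an alias for `itertools`. Only the first three items are pulled, since the slice itself never ends.

```python
import itertools as it

# Skip the first five integers; with stop=None the slice has no end.
tail = it.islice(it.count(), 5, None)

# Pull a few items by hand instead of exhausting the (infinite) iterator.
first_three = [next(tail) for _ in range(3)]

# islice objects do not define __len__, so len() raises TypeError.
try:
    len(tail)
    has_length = True
except TypeError:
    has_length = False
```

After this runs, `first_three` holds the integers 5, 6 and 7, and `has_length` is False.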
participants (6)
- Ben Rudiak-Gould
- Christopher Barker
- David Mertz
- Dominik Vilsmeier
- Mathew Elman
- Steven D'Aprano