On Tue, Oct 06, 2020 at 10:25:01PM -0700, Guido van Rossum wrote:
On Tue, Oct 6, 2020 at 18:16 Steven D'Aprano wrote:
For `__advance__` to be an official Python protocol, it would almost certainly have to be of use for *general purpose iterators*, not just specialised ones -- and probably not *hypothetical* iterators which may not even exist. Do you have concrete examples of your skip list and tree iterators that are in wide-spread use?
Yeah, I’m still waiting for the real use case to be revealed. (There may well be one.)
Yes that's the critical thing -- we could expose the internal state of the list iterator if it was desirable. It is already exposed as a (private?) dunder:
    py> it = iter([10, 20, 30, 40, 50])
    py> next(it)
    10
    py> it.__setstate__(3)
    py> next(it)
    40
    py> it.__setstate__(0)
    py> next(it)
    10
but once we expose it, we can't easily change our mind again. So it is reasonable to be a bit cautious about locking this interface in as a public feature.
(Aside: I'm actually rather surprised that it's exposed as a dunder.)
Specialised iterators can create whatever extra APIs they want to support, but the official iterator protocol intentionally has a very basic API:
- anything with an `__iter__` method which returns itself;
- and a `__next__` method that returns the next value, raising StopIteration when exhausted.
This is a bare minimum needed to make an iterator, and we like it that way. For starters, it means that generators are iterators.
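To make the two-method protocol concrete, here is a minimal sketch. The names `Countdown` and `countdown` are hypothetical and exist only for illustration; the class spells out both methods by hand, while the generator function gets them for free.

```python
class Countdown:
    """Hypothetical iterator that counts down from `start` to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        # Signal exhaustion with StopIteration.
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


def countdown(start):
    """The same sequence written as a generator function."""
    while start > 0:
        yield start
        start -= 1


class_result = list(Countdown(3))
generator_result = list(countdown(3))
```

Both produce the same values, and the generator object satisfies the protocol without any explicit `__iter__` or `__next__`.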
There’s a precedent though, __length_hint__ (PEP 424).
The OP anticipated this, with “[a] function which dispatches to a dunder __advance__ method (if one exists) or, as a fallback, calls next repeatedly.” Clearly the function would be on a par with len() and next().
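A rough sketch of what such a dispatching function might look like. Neither `advance()` nor `__advance__` exists in Python today; both are the names under discussion, and the "skip n steps forward" semantics and the `Naturals` class are assumptions made purely for illustration.

```python
def advance(iterator, n):
    """Skip `n` items of `iterator` (hypothetical; not a builtin).

    Dispatches to a `__advance__` method looked up on the type, if
    one exists, and otherwise falls back to calling next() n times.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    method = getattr(type(iterator), "__advance__", None)
    if method is not None:
        method(iterator, n)
        return
    # O(n) fallback that works for any iterator.
    for _ in range(n):
        try:
            next(iterator)
        except StopIteration:
            return  # exhausted early; nothing left to skip


class Naturals:
    """Hypothetical iterator over 0, 1, 2, ... with an O(1) skip."""

    def __init__(self):
        self.n = 0

    def __iter__(self):
        return self

    def __next__(self):
        value = self.n
        self.n += 1
        return value

    def __advance__(self, n):
        self.n += n


fast = Naturals()
advance(fast, 5)          # uses __advance__
after_fast = next(fast)

slow = iter([10, 20, 30, 40, 50])
advance(slow, 3)          # list iterators have no __advance__: fallback
after_slow = next(slow)
```

Looking the method up on the type rather than the instance mirrors how the interpreter treats other special methods; that choice is also an assumption here, not part of any proposal text.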
Sure, and I appreciate that we could offer an O(N) fallback. Your point is well taken. There's precedent with the `in` operator too, which falls back on iteration and equality if no `__contains__` method is defined. I'm not sure that either len or next are good precedents? As far as I can tell, len() does not call `__length_hint__`, and next() only dispatches to `__next__`.
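The difference between the two precedents can be checked directly. In this sketch the class `Squares` is hypothetical; `operator.length_hint()` is the standard-library function that consults `__length_hint__`.

```python
import operator


class Squares:
    """Hypothetical iterable with no __contains__ and no __len__."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        return (i * i for i in range(self.n))

    def __length_hint__(self):
        return self.n


squares = Squares(5)

# `in` falls back on iteration plus equality tests.
found = 9 in squares
missing = 7 in squares

# operator.length_hint() consults __length_hint__ ...
hint = operator.length_hint(squares)

# ... but len() does not.
try:
    len(squares)
    len_worked = True
except TypeError:
    len_worked = False
```

So `in` is a builtin operation with a generic fallback, whereas `__length_hint__` is only reachable through `operator.length_hint()`.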
If people want to supply objects that support the iterator protocol but also offer a rich API including:
- peek
- restart
- previous
- jump ahead (advance)
all features that have been proposed, there is nothing stopping you from adding those features to your iterator classes. But they all have problems if considered to be necessary for *all* iterators.
Strawman, since all except advance() would require some kind of buffering to build them out of the basics.
It's hardly a strawman when people actually have requested each of those as extensions to the iterator protocol! Don't make me go hunting for references :-) As for the buffering issue, sure, that's a point against those proposals, but itertools provides a tee function that buffers the iterator. So "needs a buffer" is not necessarily a knock-down objection to these features, even for the std lib.
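As an illustration of the buffering point, here is a minimal sketch of a peek built on `itertools.tee`. The helper name `peek` and its return convention are assumptions for this example, not an existing API.

```python
import itertools


def peek(iterator, default=None):
    """Return (next_value, replacement_iterator) without losing the value.

    Hypothetical helper. After calling it, use the returned iterator
    rather than the one passed in, as itertools.tee requires.
    """
    original, lookahead = itertools.tee(iterator)
    # Advancing `lookahead` pulls one item from the source; tee keeps
    # it buffered so `original` still yields it.
    return next(lookahead, default), original


it = iter([1, 2, 3])
first, it = peek(it)
remaining = list(it)
```

The peeked value is still delivered by the replacement iterator, at the cost of the buffer that tee maintains.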
I would expect that, given a sufficiently compelling real-world use-case, we would be prepared to add a jump ahead method to list-iterators, as a specific feature of that iterator, not of all iterators.
But the advance() function could support all iterators using the OP’s fallback.
Sure. We could do that. What's the interface? Is this a skip ahead by N steps, or skip directly to state N? I can imagine uses for both. Can we skip backwards if the underlying list supports it?
`listiter.__setstate__` supports the second interface. There's no getstate dunder that I can see. Should there be?
Here's a cautionary tale. Back in the days when Python's PRNG was Wichmann-Hill, we added a `jumpahead(n)` interface to step forward n steps more efficiently than just calling random() n times. This lasted exactly two releases, 2.1 and 2.2, before the PRNG changed and stepping forward n steps efficiently was no longer possible, and the method changed to essentially ignore n and just jump ahead to some distant state.
https://docs.python.org/release/2.3/lib/module-random.html
I'm not arguing against this proposal, or for it. I'm just mentioning some considerations which should be considered :-)
-- Steve