__setstate__, a generic __getstate__, listiter.__setstate__, (typed) arrays and memoryviews

```python
import collections.abc
from array import array

list_ = [0, 10, 20]
assert [list_[i] for i in range(len(list_))] == [0, 10, 20]

iterator = iter(list_)
assert [next(iterator) for n in range(2)] == [0, 10]


iterator = iter(list_)
assert iterator.__reduce__() == (iter, (list_,), 0)
assert next(iterator) == 0
assert iterator.__reduce__() == (iter, (list_,), 1)
assert next(iterator) == 10
assert iterator.__reduce__() == (iter, (list_,), 2)
iterator.__setstate__(0)
assert iterator.__reduce__() == (iter, (list_,), 0)
assert next(iterator) == 0
assert next(iterator) == 10
assert next(iterator) == 20
assert iterator.__reduce__() == (iter, (list_,), 3)
assert iterator.__reduce__() == (iter, (list_,), len(list_))
try:
    next(iterator)
except StopIteration:
    pass
assert iterator.__reduce__() == (iter, ([],))
iterator.__setstate__(1)
try:
    assert next(iterator) == 10
except StopIteration:
    pass
iterator = iter(list_)
iterator.__setstate__(1)
assert next(iterator) == 10
assert iterator.__reduce__() == (iter, (list_,), 2)


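# For protocols 0 and 1, list has no generic reduce; each call raises TypeError.
# The wording varies by version: "can't pickle list objects" on older
# releases, "cannot pickle 'list' object" on newer ones.
for reduce_call in (
        lambda: [1, 2, 3].__reduce__(),
        lambda: [1, 2, 3].__reduce_ex__(0),
        lambda: [1, 2, 3].__reduce_ex__(1)):
    try:
        reduce_call()
    except TypeError as e:
        assert 'pickle' in e.args[0]
# protocol 2 and above work
[1, 2, 3].__reduce_ex__(2)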


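def __getstate__(obj):
    """Return the state from obj's reduce tuple (an index for list iterators)."""
    if (not isinstance(obj, collections.abc.Iterable)
            or isinstance(obj, list)):
        raise TypeError('__getstate__ only works with iterables', type(obj))
    # prefer __reduce_ex__ (protocol 2) and fall back to __reduce__
    reducefunc = getattr(obj, '__reduce_ex__', None)
    reduceoutput = reducefunc(2) if reducefunc else obj.__reduce__()
    if len(reduceoutput) < 3:
        # an exhausted list iterator reduces to (iter, ([],)), with no state
        raise StopIteration  # ?
    return reduceoutput[2]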


iterator = iter(list_)
assert __getstate__(iterator) == 0
next(iterator)
assert __getstate__(iterator) == 1
next(iterator)
assert __getstate__(iterator) == 2
next(iterator)
assert __getstate__(iterator) == 3
try:
    next(iterator)
except StopIteration:
    pass
try:
    __getstate__(iterator)
except StopIteration:
    pass

iterator = iter(list_)
assert __getstate__(iterator) == 0
assert next(iterator) == 0
assert __getstate__(iterator) == 1
iterator.__setstate__(0)
assert __getstate__(iterator) == 0
assert next(iterator) == 0
assert __getstate__(iterator) == 1

try:
    __getstate__([1, 2, 3])
except TypeError as e:
    assert e.args[0] == "__getstate__ only works with iterables"
    assert e.args[1] == list, e.args[1]  # the rejected type: list, not list_iterator


# arrays must be typed (every item has the same size);
# otherwise O(1) random access isn't possible,
# because skipping ahead or back by n items would mean measuring the size
# of each item in between instead of computing n * itemsize
list_ary = array('i', list_)
iterator = iter(list_ary)
assert [next(iterator) for n in range(2)] == [0, 10]

ary_memoryview = memoryview(list_ary)
iterator = iter(ary_memoryview)
assert [next(iterator) for n in range(2)] == [0, 10]

assert ary_memoryview.obj == list_ary
assert ary_memoryview.tolist() == list_

assert ary_memoryview[1] == 10
ary_memoryview[1] = 100
assert ary_memoryview[1] == 100
assert list_ary[1] == 100
assert ary_memoryview[:2].tolist() == [0, 100]
list_ary[1] = 1000
assert ary_memoryview[1] == 1000
assert ary_memoryview[:2].tolist() == [0, 1000]


ary_memoryview.release()
try:
    ary_memoryview[:2].tolist()
except ValueError as e:
    assert e.args[0] == "operation forbidden on released memoryview object"


list_ = [0, 10, 20]
iterator = iter(list_)
assert next(iterator) == 0
list_.insert(1, 5)  # the iterator only stores an index, so it sees the insert
assert next(iterator) == 5
```
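
To make the comment about typed arrays concrete, here is a small sketch of my own (the names `ary`, `view`, `raw` and `item_bytes` are only for illustration). It shows that item n of a typed array starts at byte offset n * itemsize, which is why random access needs no scanning:

```python
import sys
from array import array

ary = array('i', [0, 10, 20])
view = memoryview(ary)

# every item has the same size, so the buffer length is predictable
assert view.format == 'i'
assert view.itemsize == ary.itemsize
assert view.nbytes == len(ary) * ary.itemsize

# item n starts at byte offset n * itemsize; no scanning needed
raw = view.cast('B')
n = 2
offset = n * ary.itemsize
item_bytes = raw[offset:offset + ary.itemsize]
assert int.from_bytes(item_bytes, sys.byteorder, signed=True) == ary[n]
```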

- https://docs.python.org/3/library/pickle.html#object.__setstate__

- listiter_setstate:
  https://github.com/python/cpython/blob/v3.10.0a1/Objects/listobject.c#L3215-L3229 :

```c
static PyObject *
listiter_setstate(listiterobject *it, PyObject *state)
{
    Py_ssize_t index = PyLong_AsSsize_t(state);
    if (index == -1 && PyErr_Occurred())
        return NULL;
    if (it->it_seq != NULL) {
        if (index < 0)
            index = 0;
        else if (index > PyList_GET_SIZE(it->it_seq))
            index = PyList_GET_SIZE(it->it_seq); /* iterator exhausted */
        it->it_index = index;
    }
    Py_RETURN_NONE;
}
```
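
The clamping in the C function above can be observed from Python. A small check of my own, assuming CPython (other implementations may not expose `__setstate__` on list iterators at all):

```python
list_ = [0, 10, 20]
iterator = iter(list_)

# negative indexes are clamped to 0
iterator.__setstate__(-5)
assert iterator.__reduce__()[2] == 0

# indexes past the end are clamped to len(list_)
iterator.__setstate__(99)
assert iterator.__reduce__()[2] == len(list_)
assert next(iterator, 'exhausted') == 'exhausted'

# a non-integer state fails in PyLong_AsSsize_t before anything is changed
try:
    iter(list_).__setstate__('1')
    raised = False
except TypeError:
    raised = True
```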


On Wed, Oct 7, 2020 at 11:32 AM Guido van Rossum <guido@python.org> wrote:
On Wed, Oct 7, 2020 at 2:13 AM Steven D'Aprano <steve@pearwood.info> wrote:
[about `__setstate__`]
> (Aside: I'm actually rather surprised that it's exposed as a dunder.)

It's used for pickling. Someone long ago must have complained that list iterators weren't picklable, and we complied.
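
As an illustration of the pickling use (my own sketch, not from the thread), a partly consumed list iterator survives a round trip through `pickle` with its position intact:

```python
import pickle

list_ = [0, 10, 20]
iterator = iter(list_)
assert next(iterator) == 0

# pickle stores the reduce tuple (iter, (list_,), 1); on load it calls
# iter() on the unpickled copy of the list and then __setstate__(1)
restored = pickle.loads(pickle.dumps(iterator))
assert next(restored) == 10

# the original iterator is unaffected by the copy
assert next(iterator) == 10
```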

> I'm not sure that either len or next are good precedents? As far as I
> can tell, len() does not call `__length_hint__`, and next() only
> dispatches to `__next__`.

I just meant that these are examples of a common pattern in Python, of a *function* wrapping a dunder *method*. Your example (`in` -> `__contains__` with a fallback if that doesn't exist) is better because it shows that a fallback is a known pattern; but it isn't exactly a function.
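
A short sketch of the patterns mentioned here, using a hypothetical `Countdown` class of my own that defines only `__iter__`. The `in` operator falls back to iteration when `__contains__` is missing, and `operator.length_hint()` is a function that wraps `__length_hint__` with a default:

```python
import operator


class Countdown:
    """Hypothetical iterable: no __contains__, __len__ or __length_hint__."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return iter(range(self.start, 0, -1))


countdown = Countdown(3)

# `in` has no __contains__ to call, so it falls back to iterating
assert 2 in countdown
assert 5 not in countdown

# length_hint() tries len(), then __length_hint__, then the default
assert operator.length_hint(countdown, -1) == -1
assert operator.length_hint(iter([0, 10, 20])) == 3

# len() itself does not consult __length_hint__
try:
    len(iter([0, 10, 20]))
    len_worked = True
except TypeError:
    len_worked = False
```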

> As for the buffering issue, sure, that's a point against those
> proposals, but itertools provides a tee function that buffers the
> iterator. So "needs a buffer" is not necessarily a knock-down objection
> to these features, even for the std lib.

Well, the buffering requires forethought (you can't go back unless you had the forethought to set up a buffer first) and consumes memory (which iterators are meant to avoid) so the argument against these is much stronger, and different from the argument against advance() -- the latter's presence costs nothing unless you call it.
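
To illustrate the forethought point (my own sketch): `itertools.tee` has to be called before anything is consumed, and it buffers every item that one branch has seen and the other has not:

```python
import itertools

source = iter([0, 10, 20, 30])

# the buffer has to be set up before reading ahead
lookahead, main = itertools.tee(source)

# reading ahead on one branch buffers the items for the other
assert [next(lookahead) for _ in range(3)] == [0, 10, 20]

# the other branch still starts from the beginning
assert next(main) == 0
assert next(main) == 10
```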
 
> What's the interface? Is this a skip ahead by N steps, or skip directly
> to state N? I can imagine uses for both.

Not all iterators remember how often next() was called, so "skip to state N" is not a reasonable API. The only reasonable thing advance(N) can promise is to be equivalent to calling next() N times.
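
A minimal sketch of an `advance(N)` with exactly that promise, based on the `consume` recipe from the itertools documentation. The function name and signature are assumptions taken from the discussion, not an existing API:

```python
import itertools


def advance(iterator, n):
    """Discard the next n items, like calling next() n times.

    Unlike n bare next() calls, this stops quietly if the iterator
    runs out early.
    """
    next(itertools.islice(iterator, n, n), None)


list_iterator = iter([0, 10, 20, 30])
advance(list_iterator, 2)
assert next(list_iterator) == 20

# works for any iterator, including generators
squares = (x * x for x in range(10))
advance(squares, 3)
assert next(squares) == 9

# advancing past the end just exhausts the iterator
advance(list_iterator, 100)
assert next(list_iterator, 'exhausted') == 'exhausted'
```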
 
> Can we skip backwards if the underlying list supports it?

We shouldn't allow this, since it wouldn't work if the input iterator was changed from a list iterator to e.g. a generator.
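
A quick check of the difference (my own sketch, assuming CPython): a list iterator exposes `__setstate__` and can be rewound, while a generator has no such method and cannot be pickled either:

```python
import pickle

list_iterator = iter([0, 10, 20])
generator = (x for x in [0, 10, 20])

assert hasattr(list_iterator, '__setstate__')
assert not hasattr(generator, '__setstate__')

# rewinding works for the list iterator
next(list_iterator)
list_iterator.__setstate__(0)
assert next(list_iterator) == 0

# generators are not picklable, so there is no state to restore
try:
    pickle.dumps(generator)
    generator_pickled = True
except TypeError:
    generator_pickled = False
```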
 
> `listiter.__setstate__` supports the second interface. There's no
> getstate dunder that I can see. Should there be?

It's called `__reduce__`. These are used for pickling and the state they pass around is supposed to be opaque.
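
One way to use these hooks without looking inside the state is `copy.copy`, which goes through `__reduce_ex__` and `__setstate__` itself. A small sketch of my own, assuming CPython's list iterator:

```python
import copy

list_ = [0, 10, 20]
iterator = iter(list_)
assert next(iterator) == 0

# copy.copy() rebuilds the iterator from its reduce tuple and restores
# the state, so the clone starts at the same position
clone = copy.copy(iterator)
assert next(clone) == 10
assert next(iterator) == 10

# it is a shallow copy: both iterators walk the same list object
assert clone.__reduce__()[1][0] is list_
```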
 
> Here's a cautionary tale to suggest some caution. [...]

I guess the worst that could happen in our case is that some class used to be implemented on top of a list and at some point changed to a linked list, and the performance of advance(N) changed from O(1) to O(N). But that's not going to happen to Python's fundamental data types (list, tuple, bytes, str, array), since (for better or for worse) they have many other aspects of their API (notably indexing and slicing) that would change from O(1) to O(N) if the implementation changed to something other than an array.
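
For illustration only, here is how an O(1) fast path for list iterators could sit next to the generic O(N) fallback. This is a hypothetical sketch of mine: it treats the reduce state as an index, which is a CPython implementation detail and exactly the kind of opaque state that is not meant to be relied on.

```python
import itertools

_LIST_ITERATOR = type(iter([]))


def advance(iterator, n):
    """Hypothetical advance(): O(1) for list iterators, O(n) otherwise."""
    if isinstance(iterator, _LIST_ITERATOR):
        reduced = iterator.__reduce__()
        if len(reduced) == 3:
            # not exhausted yet: jump directly; __setstate__ clamps to len
            iterator.__setstate__(reduced[2] + n)
        return
    # generic fallback: consume n items one by one
    next(itertools.islice(iterator, n, n), None)


fast = iter([0, 10, 20, 30])
advance(fast, 2)
assert next(fast) == 20

slow = (x for x in [0, 10, 20, 30])
advance(slow, 2)
assert next(slow) == 20
```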

> I'm not arguing against this proposal, or for it. I'm just mentioning
> some considerations which should be considered :-)

Same here, for sure. Still waiting for that real-world use case... :-)

--
--Guido van Rossum (python.org/~guido)
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/3H2XFZWNQ5DSZKFX6S3MP7MAJG7KMEX4/