[Python-ideas] Introduce collections.Reiterable

Fri Sep 20 10:10:35 CEST 2013

On Fri, Sep 20, 2013 at 1:15 AM, Stephen J. Turnbull <stephen at xemacs.org>wrote:

> Neil Girdhar writes:
>
>  > Most importantly, there's no sure *and* easy way to assert that the
>  > input a function is reiterable, and so the silent breakages are
>  > hard to discover.
>
> But you haven't defined "reiterable" yet, except "it fixes the
> breakage I've experienced".
>
> "Reiterable" could mean that the same object can be passed to
> iteration contexts freely, and in each one it will start from the
> beginning and the context will receive the same sequence of objects
> (in the same order).  Or order might not be guaranteed.  It might mean
> that the same object once exhausted can be passed to another iteration
> context and it will restart.  Or it might mean that the object
> supports a rewind method that must be explicitly called, but can be
> called even if the object hasn't been exhausted.  Or it might mean
> that the object is clonable, and functions that iterate objects passed
> into them must clone them unless they know that the object will never
> be reiterated.
>
>
Many different solutions would fix the problems I've seen. My suggestion is
that Reiterable should be define as an iteratable for which calling the
__iter__ method yields the same elements in the same order irrespective of
whether __iter__ is called while a previously returned iterator is still
iterating.  That way Antoine's above code would turn any non-reiterable
into a reiterable of this strong definition.  Correct me if I'm wrong, but
views on dicts are reiterable.

All of the above also have concurrent variations: in a threading
> context, multiple threads have access to each object and might be
> iterating with arbitrary timing.  (Eg, if a program is rewritten to
> use threads, a sequential reiteration could easily become a
> parallel/concurrent reiteration.)  Oh, another: AFAIK even
> non-iterator iterables may change their content when iterated.  Eg,
> weak containers: I forget if there are any iterables that allow
> deletions and insertions in underlying containers, but in the case of
> a weak ref deleting a ref elsewhere may cause the ref itself to
> disappear.  Should "reiterable" provide any guarantees there?
>

I think it shouldn't because Sequence doesn't guarantee that

x = len(a)
f(a)
a[x-1] = 5

won't throw if e.g., f does something to a.

>
>  > Even if the user should be the one deciding what to do, the dev has
>  > to be able to assert that the right thing was done.
>
> But there's no such need *between user and dev*.  Assertions protect a
> dev from *herself*.  Users do what they do, and devs either protect
> themselves from user vagaries, or they don't.  If the dev wants to
> protect herself from undesirable user choices, cloning an iterator in
> Python should be cheap.  If it isn't, let's fix that.
>

the dev/user terminology I think is unfortunate.  In my case, I'm both the
dev and the user.  I think a better way to word is that I personally want
to encapsulate the double-iteration in the member function.  I don't want
to have to know how my iterable is going to be used as a caller.

Your second point that the method should be able to cheaply clone an
iterator cheaply is precisely what I'd like to achieve with a "Reiterator"
class like Antoine's.  Its problem is that it makes an assumption that
non-iterator iterables are reiterable, which is not promised.  For that
class to work, that should either be promised or another mechanism should
be provided to satisfy its initial check.

>
> Assertions are useful, indeed.  But in this case, where the assertion
> itself is based on an undefined term as far as I can tell (I suspect
> this is because different use cases actually want different
> definitions), rather than an assertion the dev should treat herself
> (as a writer of other modules) as a user.  Ie, she should protect
> herself from herself in the same way by cloning the iterator (for
> efficiency converting the iterable to an iterator then cloning).
>

It sounds like you're saying put the items into a list no matter what.
 That's what I was doing before this thread.  I just thought it would be
more efficient if the object were a view, a list, a tuple, or a numpy
array, for the code to elide the list construction.  This could be achieved
as described above.

Best,
Neil

> Regards,
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20130920/00a0191b/attachment.html>