[Python-ideas] Introduce collections.Reiterable
mistersheik at gmail.com
Fri Sep 20 10:10:35 CEST 2013
On Fri, Sep 20, 2013 at 1:15 AM, Stephen J. Turnbull <stephen at xemacs.org>wrote:
> Neil Girdhar writes:
> > Most importantly, there's no sure *and* easy way to assert that the
> > input a function is reiterable, and so the silent breakages are
> > hard to discover.
> But you haven't defined "reiterable" yet, except "it fixes the
> breakage I've experienced".
> "Reiterable" could mean that the same object can be passed to
> iteration contexts freely, and in each one it will start from the
> beginning and the context will receive the same sequence of objects
> (in the same order). Or order might not be guaranteed. It might mean
> that the same object once exhausted can be passed to another iteration
> context and it will restart. Or it might mean that the object
> supports a rewind method that must be explicitly called, but can be
> called even if the object hasn't been exhausted. Or it might mean
> that the object is clonable, and functions that iterate objects passed
> into them must clone them unless they know that the object will never
> be reiterated.
Many different solutions would fix the problems I've seen. My suggestion is
that Reiterable should be define as an iteratable for which calling the
__iter__ method yields the same elements in the same order irrespective of
whether __iter__ is called while a previously returned iterator is still
iterating. That way Antoine's above code would turn any non-reiterable
into a reiterable of this strong definition. Correct me if I'm wrong, but
views on dicts are reiterable.
All of the above also have concurrent variations: in a threading
> context, multiple threads have access to each object and might be
> iterating with arbitrary timing. (Eg, if a program is rewritten to
> use threads, a sequential reiteration could easily become a
> parallel/concurrent reiteration.) Oh, another: AFAIK even
> non-iterator iterables may change their content when iterated. Eg,
> weak containers: I forget if there are any iterables that allow
> deletions and insertions in underlying containers, but in the case of
> a weak ref deleting a ref elsewhere may cause the ref itself to
> disappear. Should "reiterable" provide any guarantees there?
I think it shouldn't because Sequence doesn't guarantee that
x = len(a)
a[x-1] = 5
won't throw if e.g., f does something to a.
> > Even if the user should be the one deciding what to do, the dev has
> > to be able to assert that the right thing was done.
> But there's no such need *between user and dev*. Assertions protect a
> dev from *herself*. Users do what they do, and devs either protect
> themselves from user vagaries, or they don't. If the dev wants to
> protect herself from undesirable user choices, cloning an iterator in
> Python should be cheap. If it isn't, let's fix that.
the dev/user terminology I think is unfortunate. In my case, I'm both the
dev and the user. I think a better way to word is that I personally want
to encapsulate the double-iteration in the member function. I don't want
to have to know how my iterable is going to be used as a caller.
Your second point that the method should be able to cheaply clone an
iterator cheaply is precisely what I'd like to achieve with a "Reiterator"
class like Antoine's. Its problem is that it makes an assumption that
non-iterator iterables are reiterable, which is not promised. For that
class to work, that should either be promised or another mechanism should
be provided to satisfy its initial check.
> Assertions are useful, indeed. But in this case, where the assertion
> itself is based on an undefined term as far as I can tell (I suspect
> this is because different use cases actually want different
> definitions), rather than an assertion the dev should treat herself
> (as a writer of other modules) as a user. Ie, she should protect
> herself from herself in the same way by cloning the iterator (for
> efficiency converting the iterable to an iterator then cloning).
It sounds like you're saying put the items into a list no matter what.
That's what I was doing before this thread. I just thought it would be
more efficient if the object were a view, a list, a tuple, or a numpy
array, for the code to elide the list construction. This could be achieved
as described above.
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the Python-ideas