[Python-Dev] Re: Reiterability
Alex Martelli
aleaxit at yahoo.com
Sat Oct 18 15:46:20 EDT 2003
On Saturday 18 October 2003 07:17 pm, Guido van Rossum wrote:
...
> > offered by any given iterator type. E.g., the presence of a special
> > method __reiter__ could indicate that this iterator IS able to
> > supply another iterator which retraces the same steps from the
...
> In cases where reiterabiliy can be implemented without much effort,
> there is already an underlying object representing the sequence
> (e.g. a collection object, or an object defining a numerical series).
...or a generator that needs to be called again, with the same parameters.
> Reiteration comes for free if you hold on to that underlying object
> rather than passing an iterator around.
Yes, but you need to pass around a somewhat complicated thing --
the iterator (to have the "current state in the iteration"), the callable
that needs to be called to generate the iterator again (iter, or the
generator, or the class whose instances are numerical series, ...)
and the arguments for that callable (the sequence, the generator's
arguments, the parameters with which to instantiate the class, ...).
Nothing terrible, admittedly, and that's presumably how I'd architect
things IF I ever met a use case for a "reiterable iterator":
class ReiterableIterator(object):
    def __init__(self, thecallable, *itsargs, **itskwds):
        self.c, self.a, self.k = thecallable, itsargs, itskwds
        self.it = thecallable(*itsargs, **itskwds)
    def __iter__(self): return self
    def next(self): return self.it.next()
    def reiter(self): return self.__class__(self.c, *self.a, **self.k)
typical toy example use:
def printwice(n, reiter):
    for i, x in enumerate(reiter):
        if i >= n: break
        print x
    for i, x in enumerate(reiter.reiter()):
        if i >= n: break
        print x

def evens():
    x = 0
    while 1:
        yield x
        x += 2
printwice(5, ReiterableIterator(evens))
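(Sanity check: since evens yields 0, 2, 4, ..., this prints the first
five even numbers twice -- once from the original iterator, once from
the fresh one that reiter() builds by calling evens again.)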
> > "Should iterator expressions preserve the reiterability of the base
> > expression?"
>
> (An iterator expression being something like
>
> (f(x) for x in S)
>
> right?)
...
> OK, I think I understand what you're after. The code for an iterator
> expression has to create a generator function behind the scenes, and
> call it. For example:
Then if I am to be able to plug it into ReiterableIterator or some such
mechanism, I need to be able to get at said generator function in order
to stash it away (and call it again), right? Hmmm, maybe an iterator
built by a generator could keep a reference to the generator it's a
child of... but that still wouldn't give the args to call it with, darn... and
I doubt it makes sense to burden every generator-made iterator with
all those thingies, for the one-in-N need to possibly reiterate on it...
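If one DID want that, it'd be more natural as an opt-in wrapper than as
a burden on every generator-made iterator -- a rough sketch (made-up
name, reusing ReiterableIterator from above):

def remember_args(genfunc):
    # hypothetical helper: calls to the wrapped generator function
    # return reiterable iterators that remember the callable and
    # its arguments, instead of plain generator-iterators
    def wrapper(*args, **kwds):
        return ReiterableIterator(genfunc, *args, **kwds)
    return wrapper

evens = remember_args(evens)
it = evens()          # iterates just like before...
it2 = it.reiter()     # ...but can also retrace from the start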
> def gen(seq):
>     for x in seq:
>         yield f(x)
>
> class Helper:
>     def __init__(self, seq):
>         self.seq = seq
>     def __iter__(self):
>         return gen(self.seq)
>
> A = Helper(S)
>
> Then every time you use iter(A) gen() will be called with the saved
> value of S as argument.
Yes, that would let ReiterableIterator(iter, A) work, of course.
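I.e., just restating the obvious in code:

ri = ReiterableIterator(iter, A)    # each (re)build calls iter(A),
                                    # which calls gen(A.seq) afresh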
> > I suppose technically, this means the itercomp doesn't return an
> > iterator, but an iterable, which I suppose could be confusing if you
> > try to call its 'next()' method. But then, it could have a next()
> > method that raises an error saying "call 'iter()' on me first".
>
> I don't mind that so much, but I don't think all the extra machinery
> is worth it; the compiler generally can't tell if it is needed so it
> has to produce the reiterable code every time. If you *want* to
> have an iterable instead of an iterator, it's usually easy enough to do
> (especially given knowledge about the type of S).
Yeah, that seems sensible to me.
> [Alex again]
>
> > There ARE other features I'd REALLY have liked to get from iterators
> > in some applications.
> >
> > A "snapshot" -- providing me two iterators, the original one and
> > another, which will step independently over the same sequence of
> > items -- would have been really handy at times. And a "step back"
...
> > disturbed); but not knowing the abilities of the underlying iterator
> > would mean these wrappers would often duplicate functionality
> > needlessly.
>
> I don't see how it can be done without an explicit request for such a
> wrapper in the calling code. If the underlying iterator is ephemeral
> (is not reiterable) the snapshotter has to save a copy of every item,
> and that would defeat the purpose of iterators if it was done
> automatically. Or am I misunderstanding?
No, you're not. But, if the need to snapshot (or reiterate, a very different
thing) were deemed important (and I have my doubts whether either of them
IS important enough -- I suspect snapshot perhaps, reiterable not, but
I don't _know_), we COULD have those iterators which "know how to
snapshot themselves" expose a .snapshot or __snapshot__ method.
Then a function make_a_snapshottable(it) [the names are sucky, sorry,
bear with me] would return it if that method was available, otherwise
the big bad wrapper around it.
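Concretely, something like this (a rough sketch; SnapshotWrapper is my
made-up name for the "big bad wrapper", which has to buffer items
exactly as you describe above):

class SnapshotWrapper(object):
    # the "big bad wrapper": buffers every item pulled off the
    # underlying ephemeral iterator, so snapshots can replay them
    def __init__(self, it, buf=None, start=0):
        if buf is None: buf = []
        self.it, self.buf, self.i = it, buf, start
    def __iter__(self): return self
    def next(self):
        if self.i == len(self.buf):
            self.buf.append(self.it.next())
        item = self.buf[self.i]
        self.i += 1
        return item
    def snapshot(self):
        # clones share the buffer and the underlying iterator,
        # but step along independently from the current position
        return SnapshotWrapper(self.it, self.buf, self.i)

def make_a_snapshottable(it):
    if hasattr(it, 'snapshot'):
        return it                        # already able: use as-is
    return SnapshotWrapper(iter(it))     # otherwise, wrap it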
Basically, by exposing suitable methods an iterator could "make its
abilities known" to functions that may or may not need to wrap it in
order to achieve certain semantics -- so the functions can build
only those wrappers which are truly indispensable for the purpose.
Roughly the usual "protocol" approach -- functions use an object's
ability IF that object exposes methods providing that ability, and
otherwise fake it on their own.
> I'm not sure what you are suggesting here. Are you proposing that
> *some* iterators (those which can be snapshotted cheaply) sprout a new
> snapshot() method?
If snapshottability (eek!) is important enough, yes, though __snapshot__
might perhaps be more traditional (but for iterators we do have the
precedent of method next without __underscores__).
> > As I said I do have use cases for all of these. Simplest is the
> > ability to push back the last item obtained by next, since a frequent
Yeah, that's really easy to provide by a lightweight wrapper, which
was my not-so-well-clarified intended point.
> This definitely sounds like you'd want to create an explicit wrapper
Absolutely.
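Something along these lines, say (made-up names again):

class PushbackWrapper(object):
    # lightweight wrapper giving any iterator a pushback ability
    def __init__(self, it):
        self.it, self.pushed = it, []
    def __iter__(self): return self
    def next(self):
        if self.pushed:
            return self.pushed.pop()
        return self.it.next()
    def pushback(self, item):
        # item will be the first thing next() returns afterwards
        self.pushed.append(item)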
> Perhaps a snapshottable iterator could also have a backup() method
> (which would decrement self.i in your first example) or a prev()
> method (which would return self.sequence[self.i] and decrement
> self.i).
It seems to me that the ability to back up and that of snapshotting
are somewhat independent.
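E.g., over a concrete sequence (following your self.sequence/self.i
sketch -- the class here is hypothetical) each ability is a one-liner
needing nothing from the other:

class SeqIter(object):
    # hypothetical sequence-based iterator
    def __init__(self, sequence, i=0):
        self.sequence, self.i = sequence, i
    def __iter__(self): return self
    def next(self):
        if self.i >= len(self.sequence):
            raise StopIteration
        item = self.sequence[self.i]
        self.i += 1
        return item
    def backup(self):
        self.i -= 1                 # backing up: just the index
    def snapshot(self):
        # snapshotting: just copy the index, share the sequence
        return self.__class__(self.sequence, self.i)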
> > A "snapshot" would be useful whenever more than one pass on a
> > sequence _or part of it_ is needed (more useful than a "restart"
> > because of the "part of it" provision). And a decent wrapper for it
> > is a bear...
>
> Such wrappers for specific container types (or maybe just one for
> sequences) could be in a standard library module. Is more needed?
I think that if it's worth providing a wrapper it's also worth having
those iterators that don't need the wrapper (because they already
intrinsically have the needed ability) sprout the relevant method or
special method; "factory functions" provided with the wrappers
could then just return the already-satisfactory iterator, or a wrapper
built around it, depending.
Problem is, I'm NOT sure if "it's worth providing a wrapper" in
each of these cases. Snapshottingability (:-) is the one case where,
if I had to decide myself right now, I'd say "go for it"... but that may
be just because it's the one case for which I happened to stumble
on some use cases in production (apart from "undoing", which isn't
too bad to handle in other ways anyway).
Alex