[Python-Dev] generator comprehension syntax, was: accumulator display syntax

Alex Martelli aleaxit at yahoo.com
Sat Oct 18 05:20:45 EDT 2003


On Saturday 18 October 2003 05:57 am, Guido van Rossum wrote:
> > Which, by the way, brings up a question: should iterator comps be
> > reiterable?  I don't see any reason right now why they shouldn't be, and
> > can think of situations where reiterability would be useful.
>
> Oh, no.  Not reiterability again.  How can you promise something to be
> reiterable if you don't know whether the underlying iterator can be
> reiterated?  Keeping a hidden buffer would be a bad idea.

I agree it would be bad to have "black magic" performed by every iterator to
fulfil a contract that may or may not be useful to clients and might be costly
to fulfil.

IF "reiterability" is useful (and I'd need to see some use cases, because I
don't particularly recall pining for it in Python) it should be exposed as a
separate protocol that may or may not be offered by any given iterator
type.  E.g., the presence of a special method __reiter__ could indicate that
this iterator IS able to supply another iterator which retraces the same
steps from the start; and perhaps iter(xxx, reiterable=True) could strive
to provide a reiterable iterator for xxx, which might justify building one 
that keeps a hidden buffer as a last resort.  But first, I'd like use 
cases...

There ARE other features I'd REALLY have liked to get from iterators in
some applications.

A "snapshot" -- providing me two iterators, the original one and another,
which will step independently over the same sequence of items -- would
have been really handy at times.  And a "step back" facility ("undo" of
the last call to next) -- sometimes one level would suffice, sometimes not;
often I could have provided the item to be "pushed back" so the iterator
need not retain memory of it independently, but that wouldn't always be
handy.  Now any of these can be built as a wrapper over an existing
iterator, of course -- just like 'reiterability' could (and you could in fact
easily implement reiterability in terms of snapshotting, by just ensuring a
snapshot is taken at the start and further snapshotted but never disturbed);
but not knowing the abilities of the underlying iterator would mean these
wrappers would often duplicate functionality needlessly.

E.g.:

class snapshottable_sequence_iter(object):
    def __init__(self, sequence, i=0):
        self.sequence = sequence
        self.i = i
    def __iter__(self): return self
    def next(self):
        try: result = self.sequence[self.i]
        except IndexError: raise StopIteration
        self.i += 1
        return result
    def snapshot(self):
        return self.__class__(self.sequence, self.i)

Here, snapshotting is quite cheap, requiring just a new counter and
another reference to the same underlying sequence.  So would be
restarting and stepping back, directly implemented.  But if we need
to wrap a totally generic iterator to provide "snapshottability", we
inevitably end up keeping a list (or the like) of items so far seen from
one but not both 'independent' iterators obtained by a snapshot --
all potentially redundant storage, not to mention the possible coding
trickiness in maintaining that FIFO queue.

As I said I do have use cases for all of these.  Simplest is the
ability to push back the last item obtained by next, since a frequent
patter is:
    for item in iterator:
        if isok(item): process(item)
        else:
            # need to push item back onto iterator, then
            break
    else:
        # all items were OK, iterator exhausted, blah blah

    ...and later...

    for item in iterator:    # process some more items

Of course, as long as just a few levels of pushback are enough, THIS
one is an easy and light-weight wrapper to write:

class pushback_wrapper(object):
    def __init__(self, it):
        self.it = it
        self.pushed_back = []
    def __iter__(self): return self
    def next(self):
        try: return self.pushed_back.pop()
        except IndexError: return self.it.next()
    def pushback(self, item):
        self.pushed_back.append(item)


A "snapshot" would be useful whenever more than one pass on a
sequence _or part of it_ is needed (more useful than a "restart" because
of the "part of it" provision).  And a decent wrapper for it is a bear...


Alex




More information about the Python-Dev mailing list