[Python-3000] Builtin iterator type

Tue Nov 14 17:49:35 CET 2006

On 11/14/06, Nick Coghlan <ncoghlan at gmail.com> wrote:

> George Sakkis wrote:
> > On 11/14/06, Fredrik Lundh <fredrik at pythonware.com> wrote:
> >
> >> BJörn Lindqvist wrote:
> >>
> >>> But why is both the dict and list protocol so fat then? Is it hard to
> >>> create your own dict or list-derived types in Python?
> >> don't confuse things like lists and dictionaries with things like
> >> sequences and mappings.  iterators and iterables belong to the second
> >> category.
> >
> > This doesn't answer my last question: why do we need itertools when we
> > can live without sequencetools, mappingtools, fileliketools, etc. ?
>
> Because the iterator protocol is simple enough to be easily composable - there
> are a wide variety of things that can be done with an object when all you know
> about it is that it is some form of iterable. The more assumptions you have to
> start making about the iterable, the less widely applicable the resulting
> operation will be.
>
> And I think the number one reason we don't see a compelling need for the extra
> tool libraries you describe is the fact that sequences, mappings and files are
> themselves all iterables.
>
> That said, there are actually a number of modules for working with sequences
> in the standard library, like bisect and heapq. They just aren't lumped into
> one place the way itertools and functools are.
>
> Having a rich method API vs having a narrow method API and duck-typed support
> functions is a design trade-off. In the case of sequences and mappings, the
> trade-off went towards a richer API because the de facto reference
> implementations were the builtin dict and list classes (which is why DictMixin
> and ListMixin are so useful when implementing your own containers). In the
> case of iterables and iterators, the trade-off went towards the narrow API so
> that the interface could be used in a wide variety of situations (lines in a
> file, records in a database, characters in a string, bytes from a serial port,
> frames in a bowling game, active players in a MMORPG, etc, etc, etc).
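
To make the composability point above concrete, here is a minimal sketch. The helper name first_n is hypothetical, made up for this illustration; it assumes nothing about its argument except that it is iterable, so the same function works on a list, a string, a dict and a generator:

```python
from itertools import islice


def first_n(iterable, n):
    """Return the first n items of any iterable as a list."""
    # islice only needs the iterator protocol, nothing container-specific.
    return list(islice(iterable, n))


from_list = first_n([10, 20, 30, 40], 2)            # a sequence
from_str = first_n("spam", 2)                       # characters in a string
from_dict = first_n({"a": 1}, 2)                    # keys of a mapping
from_gen = first_n((i * i for i in range(10)), 3)   # a lazy generator
```

A version that relied on len() or indexing would fail on the generator case, which is the loss of applicability described in the quoted text.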

Given the overly negative reaction to a base iterator type, I withdraw
the part of my proposal that suggests this type as the base of all
iterators. Instead I propose a builtin Iter (or even better iter, if
there is no objection to changing iter's current behavior) as an OO
replacement for the (functional) itertools API. In other words, instead
of:

from itertools import chain, islice, groupby
for k,sub in groupby(chain(islice(it1, 1, None), islice(it2, 5)),
                     key=str.lower):
    print k, list(sub)

you will write:

for k,sub in (Iter(it1)[1:] + Iter(it2)[:5]).groupby(str.lower):
    print k, list(sub)

it1, it2 can be arbitrary iterators, no restrictions imposed. Iter()
(or iter()) will just return a thin wrapper around them to provide
itertools functionality in a less verbose, more Pythonic way. We can
decide on the exact API of this wrapper later, but for now I'd like to
get a general impression of its chances of acceptance.
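
As a rough sketch of what such a wrapper might look like: the class below is only an assumption for illustration, not a settled API. It maps slicing onto islice, + onto chain, and a groupby method onto itertools.groupby, and the sample data in it1 and it2 is made up. It is written so that it also runs on later Python versions.

```python
from itertools import chain, groupby, islice


class Iter(object):
    """Hypothetical thin wrapper exposing itertools operations as methods."""

    def __init__(self, iterable):
        self._it = iter(iterable)

    def __iter__(self):
        return self._it

    def __getitem__(self, index):
        # Only slices are handled in this sketch. islice rejects negative
        # start/stop/step values, so those raise ValueError.
        if not isinstance(index, slice):
            raise TypeError("Iter supports slicing only")
        return Iter(islice(self._it, index.start, index.stop, index.step))

    def __add__(self, other):
        # Concatenation maps onto itertools.chain.
        return Iter(chain(self._it, other))

    def groupby(self, key=None):
        # Same semantics as itertools.groupby: only consecutive items
        # with equal keys are grouped together.
        return groupby(self._it, key)


# Made-up sample data; any iterators would do.
it1 = iter(["x", "a", "A", "b"])
it2 = iter(["B", "c", "C", "d", "e", "f"])

result = [(k, list(sub))
          for k, sub in (Iter(it1)[1:] + Iter(it2)[:5]).groupby(str.lower)]
```

Note that the wrapper consumes the underlying iterator just as the itertools functions do, so slicing is a one-shot operation rather than random access.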

Thoughts?

George