[Python-ideas] Rewriting the "roundrobin" recipe in the itertools documentation

Thu Nov 16 08:56:21 EST 2017

For taking values alternately from a series of iterables, there are two
primary functions:

builtins.zip
itertools.zip_longest

zip, of course, stops when the shortest iterable ends. zip_longest is
generally a useful substitute when you don't want that behavior, but it
pads the gaps with a fill value rather than simply skipping a finished
iterator and moving on with the rest.
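
As a quick illustration of the difference (the sample inputs are made up
for this example and mirror the recipe's docstring):

```python
from itertools import zip_longest

# zip stops as soon as the shortest input ('D') is exhausted.
truncated = list(zip('ABC', 'D', 'EF'))
# truncated == [('A', 'D', 'E')]

# zip_longest keeps going, padding finished inputs with None by default.
padded = list(zip_longest('ABC', 'D', 'EF'))
# padded == [('A', 'D', 'E'), ('B', None, 'F'), ('C', None, None)]
```

Neither result is the flat "A D E B F C" sequence that skips exhausted
inputs.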

This last use case (skipping exhausted iterators) is at least somewhat
common, judging by this[1] StackOverflow question (and its duplicates) and
by the existence of the `roundrobin` recipe[2] in the itertools docs. The
recipe satisfies this use case, and its code is repeated in the
StackOverflow answer.

However, it is remarkably unpythonic, in my opinion. That is acceptable
when it is necessary to achieve a goal, but for this functionality it is
most definitely *not* necessary. I'll paste the code here for quick
reference:

from itertools import cycle, islice

def roundrobin(*iterables):
    "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
    pending = len(iterables)
    nexts = cycle(iter(it).__next__ for it in iterables)
    while pending:
        try:
            for next in nexts:
                yield next()
        except StopIteration:
            pending -= 1
            nexts = cycle(islice(nexts, pending))

Things that strike me as unpythonic: 1) requiring the total number of input
iterables, 2) making gratuitous use of `next`, 3) using a while loop in
code dealing with iterables, and 4) combining loops, exceptions, and
composed itertools functions in non-obvious ways that make control flow
difficult to follow.

Now, I get it: looking at the "roughly equivalent to" code for zip_longest
in the docs, there doesn't seem to be much of a way around this style for
generally similar goals, and as I said above, unpythonic is fine when
necessary (practicality beats purity). But a "recipe" in the itertools docs
should *make use* of zip_longest, which already does all the unpythonic
work for you (though honestly I'm not convinced that the zip_longest code
in the docs is as pythonic as it could be, either). The following recipe,
which I also submitted to the StackOverflow question and which is generally
similar to several other later answers (all remarking that they believe it
is more pythonic), is much cleaner and better suited to demonstrating the
power of itertools to new developers than the mess of a "recipe" pasted
above.

import itertools as it

def roundrobin(*iters):
    "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
    # Perhaps "flat_zip_nofill" is a better name, or something similar
    sentinel = object()
    for tup in it.zip_longest(*iters, fillvalue=sentinel):
        yield from (x for x in tup if x is not sentinel)

In particular, this is just an extremely thin wrapper around zip_longest,
whose primary purpose is to eliminate the otherwise-mandatory "fillvalues"
that zip_longest requires to produce uniform-length tuples. It's also an
excellent example of how to make best pythonic use of iterables in general,
and itertools in particular, and as such a much better implementation to be
demonstrated in documentation.
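
A small usage sketch of the zip_longest-based recipe follows. The second
call uses inputs I made up to show why a fresh `object()` sentinel is used
instead of the default fill value of None: a None that is genuinely in the
input is preserved.

```python
import itertools as it

def roundrobin(*iters):
    "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
    sentinel = object()  # unique marker that cannot collide with real data
    for tup in it.zip_longest(*iters, fillvalue=sentinel):
        # Drop only the padding; everything else passes through in order.
        yield from (x for x in tup if x is not sentinel)

letters = list(roundrobin('ABC', 'D', 'EF'))
# letters == ['A', 'D', 'E', 'B', 'F', 'C']

# Hypothetical input containing a real None value.
with_none = list(roundrobin([1, None, 3], ['a']))
# with_none == [1, 'a', None, 3]
```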

I would thus advocate that the former recipe be replaced with the latter,
which is much more pythonic, understandable, and useful for helping new
developers acquire the style of Python. (Using the common linguistics
analogy: a dictionary and grammar for a spoken language may be enough to
communicate, but we rely on a large body of literature -- fiction,
research, poetry, etc -- as children to get that special flavor and most
expressive taste to the language. The stdlib is no Shakespeare, but it and
its docs still form an important part of the formative literature of the
Python language.)

I realize that, at the end of the day, this is a pretty trivial and
ultimately meaningless nit to pick, but I've never contributed before and
have a variety of similar minor pain points in the docs/stdlib. I'm trying
to gauge 1) how welcome this sort of minor QoL improvement is, and 2) even
if it is welcome, whether I am going about it the right way. If the answers
to both of these questions are positive for this particular case, then I'll
look into making a BPO issue and pull request on GitHub, which IIUC is the
standard path for contributions.

Thank you for your consideration.

~~~~

[1]: https://stackoverflow.com/questions/3678869/pythonic-way-to-combine-two-lists-in-an-alternating-fashion/

[2]: https://docs.python.org/3/library/itertools.html#itertools-recipes