[docs] [issue31270] Simplify documentation of itertools.zip_longest

Raphael Michel report at bugs.python.org
Thu Aug 24 12:03:33 EDT 2017


New submission from Raphael Michel:

The documentation for itertools.zip_longest contains a "roughly equivalent" pure-Python implementation of the function that is intended to help the user understand what zip_longest does on a functional level.

However, the given implementation is very hard to read for newcomers and experienced Python programmers alike, as it uses a custom-defined exception for control flow, a nested function, a condition that is always true if any arguments are passed ("while iterators"), and two other non-trivial functions from itertools (chain and repeat).

For future reference, this is the currently given implementation:

    class ZipExhausted(Exception):
        pass

    def zip_longest(*args, **kwds):
        # zip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
        fillvalue = kwds.get('fillvalue')
        counter = len(args) - 1
        def sentinel():
            nonlocal counter
            if not counter:
                raise ZipExhausted
            counter -= 1
            yield fillvalue
        fillers = repeat(fillvalue)
        iterators = [chain(it, sentinel(), fillers) for it in args]
        try:
            while iterators:
                yield tuple(map(next, iterators))
        except ZipExhausted:
            pass

This is far more complex than necessary to teach the concept of zip_longest. With this issue, I will submit a pull request with a new example implementation that seems to be at the same level of "roughly equivalent" but is much easier to read, since it uses only two loops and no complicated control flow:

    def zip_longest(*args, **kwds):
        # zip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-
        fillvalue = kwds.get('fillvalue')
        iterators = [iter(it) for it in args]

        while True:
            exhausted = 0
            values = []

            for it in iterators:
                try:
                    values.append(next(it))
                except StopIteration:
                    values.append(fillvalue)
                    exhausted += 1

            if exhausted < len(args):
                yield tuple(values)
            else:
                break
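
As a quick sanity check, the proposed generator can be compared against the real itertools.zip_longest on a few inputs. This is only a sketch: the name zip_longest_sketch and the sample inputs are chosen here for illustration, and agreement on these cases does not show equivalence in general.

```python
import itertools

def zip_longest_sketch(*args, **kwds):
    # Same body as the proposed implementation; renamed (hypothetical name)
    # so it can be compared with itertools.zip_longest.
    fillvalue = kwds.get('fillvalue')
    iterators = [iter(it) for it in args]

    while True:
        exhausted = 0
        values = []

        for it in iterators:
            try:
                values.append(next(it))
            except StopIteration:
                values.append(fillvalue)
                exhausted += 1

        if exhausted < len(args):
            yield tuple(values)
        else:
            break

# Arbitrary sample inputs, chosen for illustration only.
cases = [
    (('ABCD', 'xy'), {'fillvalue': '-'}),
    (('ABCD', 'xy'), {}),                      # default fillvalue is None
    ((range(3), [], 'ab'), {'fillvalue': 0}),
    ((), {}),                                  # no iterables at all
]

for args, kwds in cases:
    expected = list(itertools.zip_longest(*args, **kwds))
    actual = list(zip_longest_sketch(*args, **kwds))
    assert actual == expected, (args, kwds, actual, expected)
```

Each round calls next() on every iterator, including ones that are already exhausted, so the sketch relies on exhausted iterators continuing to raise StopIteration.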


Looking at the C code of the actual implementation, I don't see that either of the two implementations is obviously "more equivalent". I'm unsure about performance -- I haven't benchmarked them, but I don't think that's the point of this learning implementation.

I ran all tests from Lib/test/test_itertools.py against both the old and the new implementation. The new implementation fails 3 tests, while the old implementation fails 4. Two of the remaining failures are related to TypeErrors not being raised on invalid input, and one is related to pickling the resulting object. I believe all three are fine to ignore in this sample, as they are not relevant to its documentation purpose.
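
The following sketch illustrates the kind of difference behind such failures. I have not checked that these are exactly the cases exercised by test_itertools.py; the name zip_longest_sketch and the inputs are assumptions made for illustration. A generator function runs none of its body until the first next() call, so invalid arguments are reported late, and generator objects cannot be pickled.

```python
import itertools
import pickle

def zip_longest_sketch(*args, **kwds):
    # Same body as the proposed implementation, under a hypothetical name.
    fillvalue = kwds.get('fillvalue')
    iterators = [iter(it) for it in args]
    while True:
        exhausted = 0
        values = []
        for it in iterators:
            try:
                values.append(next(it))
            except StopIteration:
                values.append(fillvalue)
                exhausted += 1
        if exhausted < len(args):
            yield tuple(values)
        else:
            break

# The C implementation rejects non-iterable arguments at call time.
try:
    itertools.zip_longest(1, 2)
    builtin_eager = False
except TypeError:
    builtin_eager = True

# The generator version accepts the call and only fails on the first next().
gen = zip_longest_sketch(1, 2)   # no error raised here
try:
    next(gen)
    sketch_lazy_error = False
except TypeError:
    sketch_lazy_error = True

# Generator objects do not support pickling.
try:
    pickle.dumps(zip_longest_sketch('ab', 'c'))
    generator_pickles = True
except TypeError:
    generator_pickles = False

print(builtin_eager, sketch_lazy_error, generator_pickles)
```

Neither difference affects what the function yields for valid input, which is the part the documentation sample is meant to explain.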

Therefore, I believe the documentation should be changed as suggested. I'd be happy to receive any feedback or further ideas to improve its readability!

----------
assignee: docs at python
components: Documentation
messages: 300788
nosy: docs at python, rami
priority: normal
severity: normal
status: open
title: Simplify documentation of itertools.zip_longest
type: enhancement
versions: Python 3.7

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue31270>
_______________________________________

