Dear all: thank you for your replies and thoughts, most especially Steve and Terry. I am more-or-less new to contributing to Python, so I wasn't sure that the bug tracker was the best way to start -- I was looking for a sanity check and received exactly what I wanted :) Thanks to the feedback here, the bug tracker issue will be much cleaner from the start.

Regarding the meta discussion about my OP: yes, it was long-winded, detailed, pedantic and (to a certain extent) bombastic, but I was imagining the many possible responses to the suggestion (and suggested replacement), so I felt I should explain where I was coming from. Even though there was a lot of disagreement about how I described the current recipe as not very Pythonic, I'm very glad for all the perspectives and lessons I got from reading the ensuing discussions.

Now, having had some time to think, I've come up with further thoughts on the topic. In particular, I was going to create a new bug, and wrote several paragraphs on the topic summarizing this thread, and my thoughts. I'll paste those paragraphs here:

"""
To summarize, I found the current implementation of the "roundrobin" function difficult to understand, and proposed a much simpler solution that, while producing correct results, isn't quite "correct" in the sense that it glosses over a detail before "manually" correcting that detail at the end, and as such is prone to severe inefficiency in extreme cases. There was a smattering of comments from several people indicating that they liked the simpler recipe better, despite its performance drawbacks.

Terry Reedy proposed a slightly rewritten version of the current recipe, which implements the correct algorithm (without glossing over and manually correcting the details). Although I have since changed my view of the original, now that I understand how it works, Terry's rewritten version was enough clearer that I was able to understand *it* where I could not previously understand the original. (My newfound understanding of the original is largely derived from making sense of the rewritten version, which properly clued me in to what cycle and islice actually do.)
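
For readers who have not met cycle and islice before, here is a rough sketch of the cycle/islice style of round robin. It is written from memory in the spirit of the docs recipe; it is not necessarily Terry's exact code, and the name next_item is only illustrative.

```python
from itertools import cycle, islice

def roundrobin(*iterables):
    """roundrobin('ABC', 'D', 'EF') --> A D E B F C"""
    num_active = len(iterables)
    # cycle() loops endlessly over the bound __next__ methods, one per input.
    nexts = cycle(iter(it).__next__ for it in iterables)
    while num_active:
        try:
            for next_item in nexts:
                yield next_item()
        except StopIteration:
            # The method that just raised was the last one taken from the
            # cycle, so the next num_active - 1 entries are exactly the
            # inputs that are still alive. islice() takes those entries,
            # and a fresh cycle() is built over them.
            num_active -= 1
            nexts = cycle(islice(nexts, num_active))

print(list(roundrobin('ABC', 'D', 'EF')))
```

The key point is that no fill values are ever created: an exhausted input is simply dropped from the rotation.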

Either way, the current recipe can certainly be improved. Although I now find the original and its rewrite to be algorithmically clean and correct (even if the code and inline comments can be improved), the StackOverflow question (https://stackoverflow.com/questions/3678869/pythonic-way-to-combine-two-lists-in-an-alternating-fashion/) that originally got me thinking about the problem demonstrates that the algorithmically clean way is *not* obvious at all to people who aren't very familiar with the itertools module -- which is the large majority of people who use Python (even if that's a very small fraction of the people reading this bug). The second answer from the top is the one that references the recipe in the docs, but as my own first post to python-ideas demonstrates, the (large?) majority of Python users aren't familiar enough with the itertools module to be able to understand that recipe, and I also believe many people were looking for one- or two-liners to use in their own code without a function call. Adding to the confusion is the lack of a clear name: "roundrobin", "alternate", "interleave", "merge", and variations and others are all in use.
"""

Having written those, I found a StackOverflow question that roughly duplicates the one from my OP:
https://stackoverflow.com/questions/243865/how-do-i-merge-two-python-iterators

Besides emphasizing my points about the lack of even a clear name for the topic, the desire for one-liners, the mass confusion around the issue (especially regarding flattening zip [which terminates on the shortest input, a hidden gotcha] and zip_longest [someone else found the same solution as me and others in this thread]), and the all-around failure to generate anything even resembling a consensus on the topic, I also found this answer:

https://stackoverflow.com/questions/243865/how-do-i-merge-two-python-iterators/40498526#40498526

which proposes a solution that is both more correct and more efficient than the zip_longest-with-sentinels approach, and also noticeably more readable than either the original doc recipe or even Terry's cleaned-up replacement for it.
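
To make the zip and zip_longest pitfalls mentioned above concrete, here is a small illustration. The variable names are mine and purely illustrative; this is not code taken from either StackOverflow question.

```python
from itertools import chain, zip_longest

a, b = "ABC", "D"

# Flattening zip() silently drops everything past the shortest input.
flat_zip = list(chain.from_iterable(zip(a, b)))
print(flat_zip)       # ['A', 'D']

# zip_longest() keeps every item, but only by padding with a fill value
# that then has to be filtered out again. A private sentinel object is
# used so that legitimate None values in the inputs are not lost.
_sentinel = object()
flat_longest = [
    x
    for x in chain.from_iterable(zip_longest(a, b, fillvalue=_sentinel))
    if x is not _sentinel
]
print(flat_longest)   # ['A', 'D', 'B', 'C']
```

The second form gives the right answer, but for very unequal inputs most of the work goes into creating and discarding sentinels.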

Given this variety of problems with the issue, I now think that -- while updating the itertools recipe is certainly better than nothing -- the better thing to do might be to just add a new function to itertools called "zip_flat" which solves this problem. In addition to settling the StackOverflow questions and their ongoing debates about efficiency, correctness, and pythonicity (pythonicness?), it would also greatly clarify the naming issue. (Sidenote: whoever came up with "zip" as the name for the builtin was quite creative. It's remarkably short and descriptive.)
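
To give a feel for the behaviour I have in mind, here is a minimal pure-Python sketch. Note that "zip_flat" does not exist in itertools; it is only the name proposed above. The body is one possible way to get the behaviour without sentinels, and is not the code from the linked answer or a proposed final implementation.

```python
def zip_flat(*iterables):
    """Hypothetical zip_flat('ABC', 'D', 'EF') --> A D E B F C"""
    iterators = [iter(it) for it in iterables]
    while iterators:
        still_active = []
        for it in iterators:
            try:
                item = next(it)
            except StopIteration:
                # This input is exhausted: leave it out of the next round.
                continue
            still_active.append(it)
            yield item
        iterators = still_active

print(list(zip_flat('ABC', 'D', 'EF')))
```

Each round takes one item from every input that is still alive, so no fill values are ever produced, and unequal lengths are handled without truncation.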

What are the sentiments of readers here? If positive, I'll create an issue on BPO about zip_flat (rather than just improving the docs recipe).

(Sorry Steve for bringing this back to -ideas! At least this time I'm proposing an addition to the language itself! :)

Thanks for your consideration,
Bill



On Thu, Nov 16, 2017 at 5:06 PM, Steven D'Aprano <steve@pearwood.info> wrote:
On Thu, Nov 16, 2017 at 02:56:29PM -0500, Terry Reedy wrote:

> >3) using a while loop in code dealing with iterables,
>
> I agree that this is not necessary, and give a replacement below.

The OP isn't just saying that it's unnecessary in this case, but that it's
unPythonic to ever use a while loop in code dealing with iterables. I
disagree with that stronger statement.


> >def roundrobin(*iters):
> >     "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
> >     # Perhaps "flat_zip_nofill" is a better name, or something similar
> >     sentinel = object()
> >     for tup in it.zip_longest(*iters, fillvalue=sentinel):
> >         yield from (x for x in tup if x is not sentinel)
>
> This adds and then deletes grouping and fill values that are not wanted.
>  To me, this is an 'algorithm smell'.  One of the principles of
> algorithm design is to avoid unnecessary calculations.  For an edge case
> such as roundrobin(1000000*'a', ''), the above mostly does unnecessary work.

It's a recipe, not a tuned and optimized piece of production code.

And if you're going to criticise code on the basis of efficiency, then I
would hope you've actually profiled the code first. Because it isn't
clear to me at all that what you call "unnecessary work" is more
expensive than re-writing the recipe using a more complex algorithm with
calls to cycle and islice.

But I'm not here to nit-pick your recipe over the earlier ones.


[...]
> Since I have a competing 'improvement', I would hold off on a PR until
> Raymond Hettinger, the itertools author, comments.

Raise a doc issue on the tracker, and take the discussion there. I think
that this is too minor an issue to need long argument on the list.
Besides, it's not really on-topic as such -- it isn't about a change to
the language. It's about an implementation change to a recipe in the
docs.



--
Steve
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/