[Python-ideas] Adding a new function "zip_flat" to itertools (Re: Rewriting the "roundrobin" recipe in the itertools documentation)
Terry Reedy
tjreedy at udel.edu
Mon Nov 20 12:15:50 EST 2017
On 11/20/2017 11:08 AM, Steven D'Aprano wrote:
> Please don't make claims about correctness and efficiency without
> testing the code first. The second suggestion given there, using deque,
> is *not* correct as provided, as it fails to work with iterables. It
> requires the caller to pass only iterators, unlike the existing
> roundrobin recipe which accepts any iterable.
>
> Nor is it more efficient, at least on my machine -- in fact the
> opposite, it is the worst performing of the four recipes I've tried:
>
> - the current recipe from the itertools docs;
> - your re-write, using zip_longest;
> - Terry's version;
> - and the one from stackoverflow.
>
> I've attached my test code, in case you want to play around with it.
> Apologies in advance for any bugs in the test code (it's 2 in the
> morning here and I've had a long day).
>
> According to my testing, on my computer using Python 3.5, Terry's code
> is by far the fastest in all three separate test cases, but that
> probably shouldn't count since it's buggy (it truncates the results and
> bails out early under some circumstances). Out of the implementations
> that don't truncate, the existing recipe is by far the fastest.
>
> Terry, if you're reading this, try:
>
>     list(roundrobin('A', 'B', 'CDE'))
>
> Your version truncates the results to A B C instead of A B C D E as the
> itertools recipe gives.
This is due to an off-by-1 error which I corrected 3 hours later in a
follow-up post, repeated below.
---
Correct off-by-one error. I should have tested with an edge case such as
    print(list(roundrobin('ABC', '')))
> The following combines 3 statements into one for statement.
>
> def roundrobin(*iterables):
>     "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
>     nexts = cycle(iter(it).__next__ for it in iterables)
>     for reduced_len in reversed(range(1, len(iterables))):

Make that 0 rather than 1 for start value.

>         try:
>             for next in nexts:
>                 yield next()
>         except StopIteration:
>             nexts = cycle(islice(nexts, reduced_len))
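
To make the effect of the start value concrete, here is a minimal sketch that runs the same loop with the range start passed in as a parameter. The name roundrobin_with_start and its start argument are hypothetical, added only for this illustration; the loop variable is renamed next_item to avoid shadowing the builtin.

```python
from itertools import cycle, islice

def roundrobin_with_start(start, *iterables):
    # Same loop as the quoted code; 'start' (hypothetical parameter)
    # replaces the hard-coded start value of the range.
    nexts = cycle(iter(it).__next__ for it in iterables)
    for reduced_len in reversed(range(start, len(iterables))):
        try:
            for next_item in nexts:
                yield next_item()
        except StopIteration:
            # One iterator is exhausted; keep only the remaining ones.
            nexts = cycle(islice(nexts, reduced_len))

# start=1 makes one pass too few, so the output is cut short.
print(list(roundrobin_with_start(1, 'A', 'B', 'CDE')))  # ['A', 'B', 'C']
print(list(roundrobin_with_start(0, 'A', 'B', 'CDE')))  # ['A', 'B', 'C', 'D', 'E']
print(list(roundrobin_with_start(1, 'ABC', '')))        # ['A']
print(list(roundrobin_with_start(0, 'ABC', '')))        # ['A', 'B', 'C']
```

Each StopIteration ends one pass of the outer loop, so n iterables need n passes; range(1, n) only supplies n - 1 of them.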
A slightly clearer, slightly less efficient alternative would be:

from itertools import cycle, islice

def roundrobin(*iterables):
    "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
    nexts = cycle(iter(it).__next__ for it in iterables)
    # current_len is the number of iterators still active in this pass.
    for current_len in reversed(range(1, len(iterables)+1)):
        try:
            for next in nexts:
                yield next()
        except StopIteration:
            # Drop the exhausted iterator; keep the remaining ones in order.
            nexts = cycle(islice(nexts, current_len - 1))
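
A quick way to check the alternative above against the edge cases from this thread is sketched below. The reference_roundrobin helper is a hypothetical reference built on zip_longest with a sentinel, written only for this comparison; it is not the itertools docs recipe and not any of the versions Steven timed. The loop variable is renamed next_item to avoid shadowing the builtin.

```python
from itertools import chain, cycle, islice, zip_longest

def roundrobin(*iterables):
    "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
    nexts = cycle(iter(it).__next__ for it in iterables)
    for current_len in reversed(range(1, len(iterables) + 1)):
        try:
            for next_item in nexts:
                yield next_item()
        except StopIteration:
            nexts = cycle(islice(nexts, current_len - 1))

def reference_roundrobin(*iterables):
    # Hypothetical reference: interleave with zip_longest, then drop
    # the sentinel used to pad the shorter inputs.
    missing = object()
    rounds = zip_longest(*iterables, fillvalue=missing)
    for value in chain.from_iterable(rounds):
        if value is not missing:
            yield value

cases = [
    ('ABC', 'D', 'EF'),
    ('A', 'B', 'CDE'),   # Steven's truncation example
    ('ABC', ''),         # edge case with an empty iterable
    (),                  # no iterables at all
    ([1, 2], 'xyz', range(3)),
]

for case in cases:
    expected = list(reference_roundrobin(*case))
    # Plain iterables and pre-made iterators should both be accepted.
    assert list(roundrobin(*case)) == expected
    assert list(roundrobin(*map(iter, case))) == expected
```

This only checks correctness on a handful of inputs; it says nothing about relative speed.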
--
Terry Jan Reedy