[Python-ideas] Adding a new function "zip_flat" to itertools (Re: Rewriting the "roundrobin" recipe in the itertools documentation)

Terry Reedy tjreedy at udel.edu
Mon Nov 20 12:15:50 EST 2017


On 11/20/2017 11:08 AM, Steven D'Aprano wrote:

> Please don't make claims about correctness and efficiency without
> testing the code first. The second suggestion given there, using deque,
> is *not* correct as provided, as it fails to work with iterables. It
> requires the caller to pass only iterators, unlike the existing
> roundrobin recipe which accepts any iterable.
> 
> Nor is it more efficient, at least on my machine -- in fact the
> opposite, it is the worst performing of the four recipes I've tried:
> 
> - the current recipe from the itertools docs;
> - your re-write, using zip_longest;
> - Terry's version;
> - and the one from stackoverflow.
> 
> I've attached my test code, in case you want to play around with it.
> Apologies in advance for any bugs in the test code (it's 2 in the
> morning here and I've had a long day).
> 
> According to my testing, on my computer using Python 3.5, Terry's code
> is by far the fastest in all three separate test cases, but that
> probably shouldn't count since it's buggy (it truncates the results and
> bails out early under some circumstances). Out of the implementations
> that don't truncate, the existing recipe is by far the fastest.
> 
> Terry, if you're reading this, try:
> 
> list(roundrobin('A', 'B', 'CDE'))

> Your version truncates the results to A B C instead of A B C D E as the
> itertools recipe gives.

This is due to an off-by-1 error which I corrected 3 hours later in a 
follow-up post, repeated below.
---

Correct off-by-one error.  I should have tested with an edge case such as
print(list(roundrobin('ABC', '')))
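
The truncation can be reproduced with a small sketch. The helper name
roundrobin_with_start and its 'start' parameter are hypothetical, added
here only so that both range bounds can be run side by side; everything
else follows the quoted code.

```python
from itertools import cycle, islice

def roundrobin_with_start(start, *iterables):
    # 'start' is a hypothetical parameter (not in the original post); it is
    # the lower bound of the range, so 1 is the buggy value and 0 the fix.
    nexts = cycle(iter(it).__next__ for it in iterables)
    for reduced_len in reversed(range(start, len(iterables))):
        try:
            for next_item in nexts:
                yield next_item()
        except StopIteration:
            # Keep the reduced_len iterators that follow the exhausted one.
            nexts = cycle(islice(nexts, reduced_len))

print(list(roundrobin_with_start(1, 'ABC', '')))        # ['A']
print(list(roundrobin_with_start(0, 'ABC', '')))        # ['A', 'B', 'C']
print(list(roundrobin_with_start(1, 'A', 'B', 'CDE')))  # ['A', 'B', 'C']
print(list(roundrobin_with_start(0, 'A', 'B', 'CDE')))  # ['A', 'B', 'C', 'D', 'E']
```

With a start value of 1 the outer loop runs only len(iterables) - 1 times,
so it stops while one iterator is still live.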

 > The following combines 3 statements into one for statement.
 >
 > def roundrobin(*iterables):
 >      "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
 >      nexts = cycle(iter(it).__next__ for it in iterables)
 >      for reduced_len in reversed(range(1, len(iterables))):

Make that 0 rather than 1 for the start value, so that the loop also runs
for the last remaining iterator.

 >          try:
 >              for next in nexts:
 >                  yield next()
 >          except StopIteration:
 >              nexts = cycle(islice(nexts, reduced_len))

A slightly clearer, slightly less efficient alternative would be

from itertools import cycle, islice

def roundrobin(*iterables):
     "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
     nexts = cycle(iter(it).__next__ for it in iterables)
     for current_len in reversed(range(1, len(iterables)+1)):
         try:
             for next in nexts:
                 yield next()
         except StopIteration:
             nexts = cycle(islice(nexts, current_len - 1))
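
As a rough check, the for-loop version can be compared against the
pending-counter recipe. The sketch below assumes the docs recipe as it
appeared in the itertools documentation around Python 3.5, reproduced from
memory; the names roundrobin_docs and roundrobin_for and the list of test
cases are illustrative, not taken from the attached test code.

```python
from itertools import cycle, islice

def roundrobin_docs(*iterables):
    # Assumed to match the itertools docs recipe of the time (pending counter).
    pending = len(iterables)
    nexts = cycle(iter(it).__next__ for it in iterables)
    while pending:
        try:
            for next_item in nexts:
                yield next_item()
        except StopIteration:
            pending -= 1
            nexts = cycle(islice(nexts, pending))

def roundrobin_for(*iterables):
    # The for-loop variant from this post, with the loop variable renamed
    # so that it does not shadow the built-in next().
    nexts = cycle(iter(it).__next__ for it in iterables)
    for current_len in reversed(range(1, len(iterables) + 1)):
        try:
            for next_item in nexts:
                yield next_item()
        except StopIteration:
            nexts = cycle(islice(nexts, current_len - 1))

# Illustrative cases: the docstring example, the two edge cases from the
# thread, non-string iterables that are not iterators, and no arguments.
cases = [
    ('ABC', 'D', 'EF'),
    ('A', 'B', 'CDE'),
    ('ABC', ''),
    ([1, 2, 3], (4,), range(5, 7)),
    (),
]
for args in cases:
    assert list(roundrobin_for(*args)) == list(roundrobin_docs(*args))

print(list(roundrobin_for('ABC', 'D', 'EF')))  # ['A', 'D', 'E', 'B', 'F', 'C']
```

Both versions call iter() on each argument, so they accept any iterable
rather than only iterators. Agreement on these few cases is not a proof of
equivalence, and it says nothing about relative speed.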

-- 
Terry Jan Reedy


