[Python-ideas] Adding a new function "zip_flat" to itertools (Re: Rewriting the "roundrobin" recipe in the itertools documentation)
mertz at gnosis.cx
Mon Nov 20 13:04:31 EST 2017
After correcting Terry's off-by-one error, I get this on Python 3.6 (and
bumping NUMBER to 1000). In speed, Terry's is either very slightly faster
or very slightly slower than the recipe, depending on the data. I think
Terry's is more readable that roundrobin(), but still requires a little
thought compared to Bunslow's. However, using the builtin name 'next' as a
variable name seems like a bad choice to me (it doesn't break in this code,
but seems like bad practice, i'd choose some other variable name).
507-code % python rr_tester.py
testing with equal_data
testing with unequal_data
testing with extreme_data
508-code % python --version
Python 3.6.2 :: Anaconda custom (x86_64)
On Mon, Nov 20, 2017 at 9:15 AM, Terry Reedy <tjreedy at udel.edu> wrote:
> On 11/20/2017 11:08 AM, Steven D'Aprano wrote:
> Please don't make claims about correctness and efficiency without
>> testing the code first. The second suggestion given there, using deque,
>> is *not* correct as provided, as it fails to work with iterables. It
>> requires the caller to pass only iterators, unlike the existing
>> roundrobin recipe which accepts any iterable.
>> Nor is it more efficient, at least on my machine -- in fact the
>> opposite, it is the worst performing of the four recipes I've tried:
>> - the current recipe from the itertools docs;
>> - your re-write, using zip_longest;
>> - Terry's version;
>> - and the one from stackoverflow.
>> I've attached my test code, in case you want to play around with it.
>> Apologies in advance for any bugs in the test code (its 2 in the
>> morning here and I've had a long day).
>> According to my testing, on my computer using Python 3.5, Terry's code
>> is by far the fastest in all three separate test cases, but that
>> probably shouldn't count since it's buggy (it truncates the results and
>> bails out early under some circumstances). Out of the implementations
>> that don't truncate, the existing recipe is by far the fastest.
>> Terry, if you're reading this, try:
>> list(roundrobin('A', 'B', 'CDE'))
> Your version truncates the results to A B C instead of A B C D E as the
>> itertools recipe gives.
> This is due to an off-by-1 error which I corrected 3 hours later in a
> follow-up post, repeated below.
> Correct off-by-one error. I should have tested with an edge case such as
> print(list(roundrobin('ABC', '')))
> > The following combines 3 statements into one for statement.
> > def roundrobin(*iterables):
> > "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
> > nexts = cycle(iter(it).__next__ for it in iterables)
> > for reduced_len in reversed(range(1, len(iterables))):
> Make that 0 rather than 1 for start value.
> > try:
> > for next in nexts:
> > yield next()
> > except StopIteration:
> > nexts = cycle(islice(nexts, reduced_len))
> A slightly clearer, slightly less efficient alternative would be
> def roundrobin(*iterables):
> "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
> nexts = cycle(iter(it).__next__ for it in iterables)
> for current_len in reversed(range(1, len(iterables)+1)):
> for next in nexts:
> yield next()
> except StopIteration:
> nexts = cycle(islice(nexts, current_len - 1))
> Terry Jan Reedy
> Python-ideas mailing list
> Python-ideas at python.org
> Code of Conduct: http://python.org/psf/codeofconduct/
Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons. Intellectual property is
to the 21st century what the slave trade was to the 16th.
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the Python-ideas