[Python-ideas] itertools.chunks()

Oscar Benjamin oscar.j.benjamin at gmail.com
Tue Apr 9 21:19:50 CEST 2013


On 9 April 2013 17:46, Wolfgang Maier
<wolfgang.maier at biologie.uni-freiburg.de> wrote:
>
> Hi there,
> I have compared now your code for chunked (thanks a lot for sharing it!!)
> with Peter's strict_grouper using
> timeit.

Thanks. I also ran your code under different conditions, e.g. 100000
elements in chunks of 1024, because that's the sort of situation I was
interested in. strict_grouper is faster for the bulk of the iteration
but slows down noticeably at the end when the chunk size is large and
the last chunk contains fill values.
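(For context, a minimal sketch of the kind of timeit comparison being
discussed; the islice-based `chunked` here is only a stand-in, not the
code from either original post, and the numbers are illustrative.)

```python
from itertools import islice
from timeit import timeit

def chunked(iterable, size):
    # Simple islice-based chunker used only as a benchmark stand-in.
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk

# 100000 elements in chunks of 1024, as in the comparison above:
# 97 full chunks plus a final chunk of 672 elements.
data = list(range(100000))
t = timeit(lambda: sum(1 for _ in chunked(data, 1024)), number=10)
print(f"10 runs: {t:.3f}s")
```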

> As a reminder, here's the strict_grouper code again:
>
> def strict_grouper(items, size, strict):
>     fillvalue = object()
>     args = [iter(items)]*size
>     chunks = zip_longest(*args, fillvalue=fillvalue)
>     prev = next(chunks)
>
>     for chunk in chunks:
>         yield prev
>         prev = chunk
>
>     if prev[-1] is fillvalue:
>         if strict:
>             raise ValueError
>         else:

This code is the cause of the slow end performance:

>             while prev[-1] is fillvalue:
>                 prev = prev[:-1]
>
>     yield prev

I think where I was assuming large chunk sizes, Peter was assuming
small ones: each `prev = prev[:-1]` copies the whole tuple, so the
trimming loop is quadratic in the chunk size. If you change these
lines to

    n = len(prev) - 1
    while prev[n] is fillvalue:
        n -= 1
    prev = prev[:n + 1]
    yield prev

then it will probably be as fast or faster than the one I posted in
pretty much all cases.


Oscar


