[Python-ideas] Proposal: itertools.batch

Oscar Benjamin oscar.j.benjamin at gmail.com
Tue Apr 9 23:49:29 CEST 2013


On 9 April 2013 21:21, Blagoj Petrushev <b.petrushev at gmail.com> wrote:
> Hello,
>
> I have an idea for a new function in the itertools module. I've been
> using this pattern quite a lot, so maybe someone else would think it
> is useful as well.
>
> The purpose is to split an iterable into batches with fixed size, and
> each yielded batch should be an iterator as well.

Are you aware of other threads on this list discussing groupers and
batchers and so on?

>
> def batch(iterable, batch_size):
>     exhausted = False
>     batch_range = range(batch_size)
>     while not exhausted:
>         def current():
>             nonlocal exhausted
>             for _ in batch_range:
>                 try:
>                     yield next(iterable)
>                 except StopIteration:
>                     exhausted = True
>         yield current()
>
> There are problems with this implementation:
> - the use of try/except is overkill (the exception is raised only
> once, so maybe it's not that scary)
> - it goes on forever if the batches are not actually consumed

What would you want it to do in this case?

> - it yields an additional empty iterator if the original iterable's
> length is an exact multiple of batch_size.
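
To make the last point concrete, here is a small sketch that reproduces the
quoted function unchanged and runs it on two inputs. The names `even` and
`odd` are only illustrative, and the input is wrapped in iter() because the
quoted code calls next() on it directly.

```python
def batch(iterable, batch_size):
    # Reproduction of the quoted generator; `iterable` must already be an iterator.
    exhausted = False
    batch_range = range(batch_size)
    while not exhausted:
        def current():
            nonlocal exhausted
            for _ in batch_range:
                try:
                    yield next(iterable)
                except StopIteration:
                    exhausted = True
        yield current()

# Each batch is consumed fully before the next one is requested.
even = [list(b) for b in batch(iter(range(4)), 2)]
odd = [list(b) for b in batch(iter(range(5)), 2)]
print(even)  # [[0, 1], [2, 3], []]  <- trailing empty batch
print(odd)   # [[0, 1], [2, 3], [4]]
```

The empty trailing batch appears because exhaustion is only noticed while a
batch is being consumed, after that batch has already been handed out.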

This version solves the last issue and might be more efficient in general:

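from operator import itemgetter
from itertools import islice, chain, tee
try:
    from itertools import imap, izip  # Python 2
except ImportError:
    imap, izip = map, zip  # Python 3: the builtins are already lazy

def batch(iterable, batch_size):
    done = []
    def stop():
        # Sentinel generator: runs only once the input is exhausted.
        done.append(None)
        yield
    it1, it2 = tee(chain(iterable, stop()))
    next(it2) # Keep it2 one item ahead, so done is set as the last item is yielded
    iterator = imap(itemgetter(0), izip(it1, it2))
    while not done:
        yield islice(iterator, batch_size)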
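
A quick usage sketch, assuming Python 3, where the builtin map and zip stand
in for imap and izip. The function body is otherwise the same as above; the
names `exact`, `ragged` and `empty` are just for illustration.

```python
from itertools import chain, islice, tee
from operator import itemgetter

def batch(iterable, batch_size):
    done = []
    def stop():
        # Sentinel generator: runs only once the input is exhausted.
        done.append(None)
        yield
    it1, it2 = tee(chain(iterable, stop()))
    next(it2)  # it2 stays one item ahead of it1
    iterator = map(itemgetter(0), zip(it1, it2))
    while not done:
        yield islice(iterator, batch_size)

# Each batch is consumed fully before the next one is requested.
exact = [list(b) for b in batch(range(4), 2)]
ragged = [list(b) for b in batch(range(5), 2)]
empty = [list(b) for b in batch([], 2)]
print(exact)   # [[0, 1], [2, 3]]
print(ragged)  # [[0, 1], [2, 3], [4]]
print(empty)   # []
```

Like the quoted version, this still loops forever if the batches are never
consumed, since done is only set when the underlying iterator is advanced.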


Oscar


