[Python-ideas] Proposal: itertools.batch
Oscar Benjamin
oscar.j.benjamin at gmail.com
Tue Apr 9 23:49:29 CEST 2013
On 9 April 2013 21:21, Blagoj Petrushev <b.petrushev at gmail.com> wrote:
> Hello,
>
> I have an idea for a new function in the itertools module. I've been
> using this pattern quite a lot, so maybe someone else would think it
> is useful as well.
>
> The purpose is to split an iterable into batches with fixed size, and
> each yielded batch should be an iterator as well.
Are you aware of other threads on this list discussing groupers and
batchers and so on?
>
> def batch(iterable, batch_size):
>     exhausted = False
>     batch_range = range(batch_size)
>     while not exhausted:
>         def current():
>             nonlocal exhausted
>             for _ in batch_range:
>                 try:
>                     yield next(iterable)
>                 except StopIteration:
>                     exhausted = True
>         yield current()
>
> There are problems with this implementation:
> - the use of try/except is overkill (the exception is raised only
> once, so maybe it's not that scary)
> - it goes on forever if the batches are not actually consumed
What would you want it to do in this case?
> - it yields an additional empty iterator if the original iterable's
> length is an exact multiple of batch_size.
This version solves the last issue and might be more efficient in general:
from operator import itemgetter
from itertools import islice, chain, tee, imap, izip

def batch(iterable, batch_size):
    done = []
    def stop():
        done.append(None)
        yield
    it1, it2 = tee(chain(iterable, stop()))
    next(it2) # Allow StopIteration to propagate
    iterator = imap(itemgetter(0), izip(it1, it2))
    while not done:
        yield islice(iterator, batch_size)
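
The code above targets Python 2, where imap and izip live in itertools. On
Python 3 the same idea might be sketched as below. This is an assumed port
rather than something posted in the thread: the built-in map and zip stand in
for imap and izip, and the caller is expected to consume each batch before
asking for the next one.

```python
from itertools import chain, islice, tee
from operator import itemgetter


def batch(iterable, batch_size):
    # Assumed Python 3 port: built-in map/zip replace imap/izip.
    done = []

    def stop():
        # Runs only once the underlying iterable has been exhausted.
        done.append(None)
        yield

    # it2 runs one step ahead of it1, so `done` is set at the moment
    # the last real item is handed out, not one batch later.
    it1, it2 = tee(chain(iterable, stop()))
    next(it2)
    iterator = map(itemgetter(0), zip(it1, it2))
    while not done:
        yield islice(iterator, batch_size)


# Each batch is consumed (via list) before the next one is requested.
exact = [list(b) for b in batch(range(6), 3)]    # [[0, 1, 2], [3, 4, 5]]
ragged = [list(b) for b in batch(range(7), 3)]   # [[0, 1, 2], [3, 4, 5], [6]]
empty = [list(b) for b in batch([], 3)]          # []
```

With an input whose length is an exact multiple of batch_size, no trailing
empty batch appears. If the batches are not consumed, the loop still never
terminates, so that issue is left open here as well.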
Oscar