Wolfgang Maier wrote:
Dear all, the itertools documentation has the grouper() recipe, which returns consecutive tuples of a specified length n from an iterable. To do this, it uses zip_longest(). While this is an elegant and fast solution, my problem is that I sometimes don't want my tuples to be padded with a fillvalue (which happens if len(iterable) % n != 0), but would prefer an error instead. This is important, for example, when iterating over the contents of a file and you want to make sure that it's not truncated. I was wondering whether itertools, in addition to zip_longest() (and the built-in zip()), shouldn't provide something like zip_strict(), which would raise an error if its arguments aren't of equal length. zip_strict() could then be used in an alternative grouper() recipe.
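For reference, a minimal sketch of the two behaviours being contrasted: grouper() roughly as given in the itertools recipes, which pads the last tuple, and a pure-Python stand-in for the proposed zip_strict(). The names zip_strict and grouper_strict and the error message are assumptions for illustration only; neither is part of itertools.

```python
from itertools import zip_longest


def grouper(iterable, n, fillvalue=None):
    """Recipe-style grouper: the last tuple is padded with fillvalue."""
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


def zip_strict(*iterables):
    """Hypothetical zip_strict(): like zip(), but raise ValueError
    if the iterables turn out to have different lengths."""
    sentinel = object()  # unique marker that cannot occur in the input
    for combo in zip_longest(*iterables, fillvalue=sentinel):
        if any(item is sentinel for item in combo):
            raise ValueError("iterables have different lengths")
        yield combo


def grouper_strict(iterable, n):
    """Hypothetical strict grouper built on zip_strict()."""
    args = [iter(iterable)] * n
    return zip_strict(*args)


print(list(grouper("ABCDEFG", 3, "x")))
# [('A', 'B', 'C'), ('D', 'E', 'F'), ('G', 'x', 'x')]
print(list(grouper_strict("ABCDEF", 3)))
# [('A', 'B', 'C'), ('D', 'E', 'F')]
```

With an input such as "ABCDEFG" and n=3, grouper_strict() yields the two complete tuples and then raises ValueError on the incomplete one. Since Python 3.10 the built-in zip() also accepts strict=True, which covers the same need.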
By the way, right now, I am using the following workaround for this problem:

def iblock(iterable, bsize, strict=False):
    """Return consecutive lists of bsize items from an iterable.

    If strict is True, raises a ValueError if the size of the last block
    in iterable is smaller than bsize. If strict is False, it returns the
    truncated list instead."""
    it = iter(iterable)
    i = [it] * (bsize - 1)
    while True:
        try:
            result = [next(it)]
        except StopIteration:
            # iterator exhausted, end the generator
            break
        for e in i:
            try:
                result.append(next(e))
            except StopIteration:
                # iterator exhausted after returning at least one item,
                # but before returning bsize items
                if strict:
                    raise ValueError("only %d value(s) left in iterator, expected %d" % (len(result), bsize))
                else:
                    pass
        yield result

This works well, but is about 3-4 times slower than the grouper() recipe. If you have alternative, faster solutions that I haven't thought of, I'd be very interested to hear about them.
Best, Wolfgang
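A simple approach is

from itertools import zip_longest

def strict_grouper(items, size, strict):
    fillvalue = object()  # unique sentinel, cannot occur in items
    args = [iter(items)] * size
    chunks = zip_longest(*args, fillvalue=fillvalue)
    # Stay one chunk behind so the last chunk can be checked for padding.
    prev = next(chunks, None)
    if prev is None:
        # empty input: a bare next() would leak StopIteration from the
        # generator (a RuntimeError since Python 3.7)
        return
    for chunk in chunks:
        yield prev
        prev = chunk
    if prev[-1] is fillvalue:
        if strict:
            raise ValueError
        else:
            # drop the padding from the last, incomplete chunk
            prev = prev[:prev.index(fillvalue)]
    yield prev

If that's fast enough it might be a candidate for the recipes section. A partial solution I wrote a while ago is http://code.activestate.com/recipes/497006-zip_exc-a-lazy-zip-that-ensures-t...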
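Another possible answer to the request for a faster workaround, sketched here under assumptions: itertools.islice() can pull each block in a single call instead of one next() per item. The name iblock_islice is hypothetical, it mirrors the iblock() interface from the original post, and no timing claim is made; it would have to be measured against grouper().

```python
from itertools import islice


def iblock_islice(iterable, bsize, strict=False):
    """Hypothetical islice-based variant of iblock().

    Yields consecutive lists of bsize items. If strict is True, a short
    final block raises ValueError; otherwise it is yielded as is.
    """
    it = iter(iterable)
    while True:
        block = list(islice(it, bsize))  # take up to bsize items at once
        if not block:
            return  # iterator exhausted on a block boundary
        if strict and len(block) < bsize:
            raise ValueError(
                "only %d value(s) left in iterator, expected %d"
                % (len(block), bsize)
            )
        yield block


print(list(iblock_islice(range(7), 3)))
# [[0, 1, 2], [3, 4, 5], [6]]
```

As with iblock(), the strict check only fires once the short block is reached, so the complete blocks before it have already been yielded.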