[Python-ideas] zip_strict() or similar in itertools ?

Wolfgang Maier wolfgang.maier at biologie.uni-freiburg.de
Thu Apr 4 12:33:31 CEST 2013


Dear all,
the itertools documentation has the grouper() recipe, which returns
consecutive tuples of a specified length n from an iterable. To do this, it
uses zip_longest(). While this is an elegant and fast solution, my problem
is that I sometimes don't want my tuples to be filled with a fillvalue
(which happens if len(iterable) % n != 0), but I would prefer an error
instead. This is important, for example, when iterating over the contents of
a file and you want to make sure that it's not truncated.
I was wondering whether itertools, which already provides zip_longest()
alongside the built-in zip(), shouldn't also provide something like
zip_strict(), which would raise an error if its arguments aren't of equal
length.
zip_strict() could then be used in an alternative grouper() recipe.
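
To make the idea concrete, here is a minimal sketch of what I have in mind. It assumes a sentinel-based implementation on top of zip_longest(); the names zip_strict() and grouper_strict() are hypothetical and not part of itertools, and the choice of ValueError is only an assumption:

```python
from itertools import zip_longest

_MISSING = object()  # unique sentinel; cannot collide with real data


def zip_strict(*iterables):
    """Like zip(), but raise ValueError if the iterables differ in length.

    Hypothetical helper, not an existing itertools function.
    """
    for combo in zip_longest(*iterables, fillvalue=_MISSING):
        # A sentinel in the tuple means at least one iterable ran out early.
        if any(item is _MISSING for item in combo):
            raise ValueError("zip_strict() arguments have unequal lengths")
        yield combo


def grouper_strict(iterable, n):
    """Collect data into tuples of length n; raise ValueError on a short tail."""
    # The same iterator repeated n times, as in the grouper() recipe.
    args = [iter(iterable)] * n
    return zip_strict(*args)
```

With this sketch, list(grouper_strict("ABCDEF", 3)) gives two complete tuples, while grouper_strict("ABCDEFG", 3) raises once it reaches the incomplete last group. The per-tuple sentinel check is done in Python, so it will not be as fast as a version implemented inside itertools.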

By the way, right now, I am using the following workaround for this problem:

def iblock(iterable, bsize, strict=False):
    """Return consecutive lists of bsize items from an iterable.

    If strict is True, raises a ValueError if the size of the last block
    in iterable is smaller than bsize. If strict is False, it returns the
    truncated list instead."""

    it = iter(iterable)
    # bsize-1 references to the same iterator, used to fill up each block
    i = [it] * (bsize - 1)
    while True:
        try:
            result = [next(it)]
        except StopIteration:
            # iterator exhausted, end the generator
            break
        for e in i:
            try:
                result.append(next(e))
            except StopIteration:
                # iterator exhausted after returning at least one item,
                # but before returning bsize items
                if strict:
                    raise ValueError("only %d value(s) left in iterator, "
                                     "expected %d" % (len(result), bsize))
                # not strict: stop filling and yield the truncated block
                break
        yield result

This works well, but is about 3-4 times slower than the grouper() recipe.
If you have alternative, faster solutions that I haven't thought of, I'd be
very interested to hear about them.

Best,
Wolfgang



