Would like some thoughts on a grouped iterator.

Jussi Piitulainen jussi.piitulainen at helsinki.fi
Mon Sep 5 06:24:40 EDT 2016


Antoon Pardon writes:

> I need an iterator that takes an already existing iterator and
> divides it into subiterators of items belonging together.
>
> For instance take the following class, which would check whether
> the argument is greater than or equal to the previous argument.
>
> class upchecker:
>     def __init__(self):
>         self.prev = None
>     def __call__(self, arg):
>         if self.prev is None:
>             self.prev = arg
>             return True
>         elif arg >= self.prev:
>             self.prev = arg
>             return True
>         else:
>             self.prev = arg
>             return False
>
> So the iterator I need --- I call it grouped --- in combination with
> the above class would be used something like:
>
> for itr in grouped([8, 10, 13, 11, 2, 17, 5, 12, 7, 14, 4, 6, 15, 16, 19, 9, 0, 1, 3, 18], upchecker()):
>     print list(itr)
>
> and the result would be:
>
> [8, 10, 13]
> [11]
> [2, 17]
> [5, 12]
> [7, 14]
> [4, 6, 15, 16, 19]
> [9]
> [0, 1, 3, 18]
>
> Anyone an idea how I best tackle this?

Perhaps something like this when building from scratch (not wrapping
itertools.groupby). The inner grouper needs to communicate to the outer
grouper whether it ran out of this group but it obtained a next item, or
it ran out of items altogether.

Your design allows inclusion conditions that depend on more than just
the previous item in the group. This doesn't. I think itertools.groupby
may raise an error if the caller doesn't consume a group before stepping
to a new group. This doesn't. I'm not sure that itertools.groupby does
either, and I'm too lazy to check.
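
On the itertools.groupby question, a quick check is cheap. The sketch
below assumes the behaviour described in the itertools documentation
(the source is shared, so advancing the groupby object makes the
previous group no longer visible). Under that assumption it does not
raise; the unconsumed items of the earlier group are silently skipped.
The variable names are only for illustration.

```python
from itertools import groupby

groups = groupby([1, 1, 2, 2, 3])
k1, g1 = next(groups)      # group for key 1, deliberately not consumed
k2, g2 = next(groups)      # advancing skips the rest of group 1
leftover_first = list(g1)  # the earlier group now yields nothing
second = list(g2)          # the current group is still intact
print(k1, leftover_first, k2, second)
```

So no error, but the earlier group comes back empty once the outer
iterator has moved on.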

def gps(source, belong):
    def gp():
        nonlocal prev, more
        keep = True
        while keep:
            yield prev
            try:
                this = next(source)
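            except StopIteration:
                # Source exhausted: tell the outer loop to stop. A bare
                # return ends this group; re-raising StopIteration inside
                # a generator is a RuntimeError since Python 3.7 (PEP 479).
                more = False
                return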
            prev, keep = this, belong(prev, this)
    source = iter(source)
    try:
        prev = next(source)
    except StopIteration:
        return  # empty source: no groups at all
    more = True
    while more:
        yield gp()

from operator import eq, lt, gt
for data in ([], [3], [3,1], [3,1,4], [3,1,4,1,5,9,2,6]):
    for tag, op in (('=', eq), ('<', lt), ('>', gt)):
        print(tag, data, '=>', [list(g) for g in gps(data, op)])
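
For comparison, here is a rough sketch of the other route, wrapping
itertools.groupby, using a one-argument stateful predicate like the
quoted upchecker. The names Upchecker, grouped and group_no are made up
for this sketch. It assumes groupby calls the key function exactly once
per item, in source order, which is what CPython appears to do, and that
the caller consumes each group before asking for the next one.

```python
from itertools import groupby

class Upchecker:
    """True if arg >= the previous arg, or if there is no previous arg."""
    def __init__(self):
        self.prev = None
    def __call__(self, arg):
        ok = self.prev is None or arg >= self.prev
        self.prev = arg
        return ok

def grouped(source, belongs):
    """Yield sub-iterators; a new group starts when belongs(item) is false."""
    group_no = 0
    def key(item):
        # Stateful key: bump the group number whenever the item does not
        # belong with its predecessor (assumes one call per item, in order).
        nonlocal group_no
        if not belongs(item):
            group_no += 1
        return group_no
    for _, group in groupby(source, key):
        yield group

data = [8, 10, 13, 11, 2, 17, 5, 12, 7, 14, 4, 6, 15, 16, 19, 9, 0, 1, 3, 18]
result = [list(g) for g in grouped(data, Upchecker())]
for group in result:
    print(group)
```

With the sample list from the original question this prints the eight
groups listed there, from [8, 10, 13] through [0, 1, 3, 18].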


