Would like some thoughts on a grouped iterator.
Peter Otten
__peter__ at web.de
Mon Sep 5 06:46:57 EDT 2016
Jussi Piitulainen wrote:
> Antoon Pardon writes:
>
>> I need an iterator that takes an already existing iterator and
>> divides it into subiterators of items belonging together.
>>
>> For instance take the following class, which would check whether
>> the argument is greater than or equal to the previous argument.
>>
>> class upchecker:
>>     def __init__(self):
>>         self.prev = None
>>     def __call__(self, arg):
>>         if self.prev is None:
>>             self.prev = arg
>>             return True
>>         elif arg >= self.prev:
>>             self.prev = arg
>>             return True
>>         else:
>>             self.prev = arg
>>             return False
>>
>> So the iterator I need --- I call it grouped --- in combination with
>> the above class would be used something like:
>>
>> for itr in grouped([8, 10, 13, 11, 2, 17, 5, 12, 7, 14, 4, 6, 15, 16, 19,
>>                     9, 0, 1, 3, 18], upchecker()):
>>     print list(itr)
>>
>> and the result would be:
>>
>> [8, 10, 13]
>> [11]
>> [2, 17]
>> [5, 12]
>> [7, 14]
>> [4, 6, 15, 16, 19]
>> [9]
>> [0, 1, 3, 18]
>>
>> Anyone an idea how I best tackle this?
>
> Perhaps something like this when building from scratch (not wrapping
> itertools.groupby). The inner grouper needs to communicate to the outer
> grouper whether it ran out of this group but it obtained a next item, or
> it ran out of items altogether.
>
> Your design allows inclusion conditions that depend on more than just
> the previous item in the group. This doesn't. I think itertools.groupby
> may raise an error if the caller didn't consume a group before stepping
> to a new group. This doesn't. I'm not sure that itertools.groupby does
> either, and I'm too lazy to check.
>
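> def gps(source, belong):
>     def gp():
>         nonlocal prev, more
>         keep = True
>         while keep:
>             yield prev
>             try:
>                 this = next(source)
>             except StopIteration:
>                 more = False
>                 # return, not raise: a StopIteration escaping a generator
>                 # becomes RuntimeError on Python 3.7+ (PEP 479)
>                 return
>             prev, keep = this, belong(prev, this)
>     source = iter(source)
>     try:
>         prev = next(source)
>     except StopIteration:
>         return  # empty source: no groups
>     more = True
>     while more:
>         yield gp()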
>
> from operator import eq, lt, gt
> for data in ([], [3], [3,1], [3,1,4], [3,1,4,1,5,9,2,6]):
>     for tag, op in (('=', eq), ('<', lt), ('>', gt)):
>         print(tag, data, '=>', [list(g) for g in gps(data, op)])
As usual I couldn't stop and came up with something very similar:

def grouped(items, check):
    items = iter(items)
    try:
        buf = next(items)
    except StopIteration:
        return  # empty input: no groups
    more = True

    def group():
        nonlocal buf, more
        for item in items:
            yield buf
            prev = buf
            buf = item
            if not check(prev, item):
                break
        else:
            # input exhausted: emit the last buffered item and stop
            yield buf
            more = False

    while more:
        g = group()
        yield g
        # drain the current group before starting the next one
        for _ in g:
            pass

if __name__ == "__main__":
    def upchecker(a, b):
        return a < b

    items = [
        8, 10, 13, 11, 2, 17, 5, 12, 7, 14, 4, 6, 15, 16, 19, 9, 0, 1, 3, 18
    ]
    for itr in grouped(items, upchecker):
        print(list(itr))

The one thing I think you should adopt from this is that the current group
is consumed before yielding the next.
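
To illustrate why the draining matters, here is a minimal sketch. The short data list, the `<=` check and the empty-input guard are assumptions made for the demo, not part of the original request. A caller that reads only the first item of each group still sees every group start in the right place, because grouped() finishes the current group itself before producing the next one. As far as I can tell, itertools.groupby tolerates skipped groups in a similar way: it does not raise, it simply moves on to the next key.

```python
from itertools import groupby


def grouped(items, check):
    """Yield one sub-iterator per run of items for which check(prev, item) holds."""
    items = iter(items)
    try:
        buf = next(items)
    except StopIteration:
        return  # empty input: no groups (guard added for this sketch)
    more = True

    def group():
        nonlocal buf, more
        for item in items:
            yield buf
            prev, buf = buf, item
            if not check(prev, item):
                break
        else:
            # input exhausted: emit the last buffered item and stop
            yield buf
            more = False

    while more:
        g = group()
        yield g
        for _ in g:  # drain whatever the caller left unread
            pass


def non_decreasing(a, b):
    # hypothetical check used only for this demo
    return a <= b


data = [8, 10, 13, 11, 2, 17, 5]

# Fully consuming each group gives the expected runs.
runs = [list(g) for g in grouped(data, non_decreasing)]
print(runs)

# Reading only the first item of each group still lands on the group starts.
firsts = [next(g) for g in grouped(data, non_decreasing)]
print(firsts)

# groupby also copes with groups that are never read: only the keys are used here.
keys = [k for k, _ in groupby("aabbbc")]
print(keys)
```

If the draining loop were removed, I would expect the second case to go wrong, since a half-read group leaves the shared buffer pointing at an item that was already handed out.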