Split iterator into multiple streams

Peter Otten __peter__ at web.de
Sat Nov 6 05:25:42 EDT 2010


Steven D'Aprano wrote:

> Suppose I have an iterator that yields tuples of N items (a, b, ... n).
> 
> I want to split this into N independent iterators:
> 
> iter1 -> a, a2, a3, ...
> iter2 -> b, b2, b3, ...
> ...
> iterN -> n, n2, n3, ...
> 
> The iterator may be infinite, or at least too big to collect in a list.
> 
> My first attempt was this:
> 
> 
> def split(iterable, n):
>     iterators = []
>     for i, iterator in enumerate(itertools.tee(iterable, n)):
>         iterators.append((t[i] for t in iterator))
>     return tuple(iterators)
> 
> But it doesn't work, as all the iterators see the same values:
> 
> >>> data = [(1,2,3), (4,5,6), (7,8,9)]
> >>> a, b, c = split(data, 3)
> >>> list(a), list(b), list(c)
> ([3, 6, 9], [3, 6, 9], [3, 6, 9])
> 
> 
> I tried changing the t[i] to use operator.itemgetter instead, but no
> luck. Finally I got this:
> 
> def split(iterable, n):
>     iterators = []
>     for i, iterator in enumerate(itertools.tee(iterable, n)):
>         f = lambda it, i=i: (t[i] for t in it)
>         iterators.append(f(iterator))
>     return tuple(iterators)
> 
> which seems to work:
> 
> >>> data = [(1,2,3), (4,5,6), (7,8,9)]
> >>> a, b, c = split(data, 3)
> >>> list(a), list(b), list(c)
> ([1, 4, 7], [2, 5, 8], [3, 6, 9])
> 
> 
> 
> 
> Is this the right approach, or have I missed something obvious?
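
For what it's worth, the first attempt most likely fails because of late binding: a generator expression evaluates its outermost iterable immediately, but the name i in t[i] is only looked up when the generator actually runs, and by then the loop has finished and i holds its last value. The lambda with i=i works because the default argument freezes the value. A minimal sketch of the same effect, written for Python 3; the names split_late, split_bound and column are made up for illustration:

```python
import itertools

def split_late(iterable, n):
    # Buggy on purpose: `i` is looked up when each generator runs,
    # which is after the loop has ended, so every stream sees i == n - 1.
    iterators = []
    for i, iterator in enumerate(itertools.tee(iterable, n)):
        iterators.append((t[i] for t in iterator))
    return tuple(iterators)

def column(iterator, i):
    # Here `i` is a parameter, so each generator keeps its own value.
    return (t[i] for t in iterator)

def split_bound(iterable, n):
    return tuple(column(it, i)
                 for i, it in enumerate(itertools.tee(iterable, n)))

data = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]
late = [list(it) for it in split_late(data, 3)]
bound = [list(it) for it in split_bound(data, 3)]
print(late)   # every stream yields the last column
print(bound)  # one column per stream
```

So the working version is a reasonable approach; a named helper just makes the binding more explicit than the lambda.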

Here's how to do it with operator.itemgetter():

>>> from itertools import *
>>> from operator import itemgetter
>>> data = [(1,2,3), (4,5,6), (7,8,9)]
>>> abc = [imap(itemgetter(i), t) for i, t in enumerate(tee(data, 3))]
>>> map(list, abc)
[[1, 4, 7], [2, 5, 8], [3, 6, 9]]
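
The session above is Python 2. On Python 3 there is no itertools.imap, and the built-in map is already lazy, so a rough equivalent (assuming Python 3; the name result is only for illustration) might look like this:

```python
from itertools import tee
from operator import itemgetter

data = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]

# The built-in map is lazy on Python 3, so it stands in for imap.
# itemgetter(i) is built right away, so each stream keeps its own index.
abc = [map(itemgetter(i), t) for i, t in enumerate(tee(data, 3))]

# map(list, abc) would itself be lazy here, so build the lists explicitly.
result = [list(column) for column in abc]
print(result)
```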

I'd say the improvement is marginal. If you want to get fancy, you can 
infer n from the first item:

>>> def split(items, n=None):
...     if n is None:
...         items = iter(items)
...         first = next(items)
...         n = len(first)
...         items = chain((first,), items)
...     return [imap(itemgetter(i), t) for i, t in enumerate(tee(items, n))]
...
>>> map(list, split([(1,2,3), (4,5,6), (7,8,9)]))
[[1, 4, 7], [2, 5, 8], [3, 6, 9]]
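
Since the original question mentions infinite iterators, here is a sketch of the same function on Python 3 (an assumption, the session above is Python 2) fed from an endless generator. The triples generator and the first_a/first_c names are invented for the example:

```python
from itertools import chain, count, islice, tee
from operator import itemgetter

def split(items, n=None):
    if n is None:
        items = iter(items)
        first = next(items)             # raises StopIteration on empty input
        n = len(first)
        items = chain((first,), items)  # put the peeked item back in front
    return [map(itemgetter(i), t) for i, t in enumerate(tee(items, n))]

# Endless source: (0, 0, 0), (1, 10, 100), (2, 20, 200), ...
triples = ((k, 10 * k, 100 * k) for k in count())

a, b, c = split(triples)
first_a = list(islice(a, 3))  # take only a few items from each stream
first_c = list(islice(c, 2))
print(first_a, first_c)
```

One caveat: tee() has to buffer every item that one stream has consumed and another has not, so memory use grows if one of the streams runs far ahead of the others.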

Peter


