# Sequence splitting

Scott David Daniels Scott.Daniels at Acm.Org
Fri Jul 3 13:47:15 EDT 2009

```Steven D'Aprano wrote:
> I've never needed such a split function, and I don't like the name, and
> the functionality isn't general enough. I'd prefer something which splits
> the input sequence into as many sublists as necessary, according to the
> output of the key function. Something like itertools.groupby(), except it
> runs through the entire sequence and collates all the elements with
> identical keys.
>
> splitby(range(10), lambda n: n%3)
> => [ (0, [0, 3, 6, 9]),
>      (1, [1, 4, 7]),
>      (2, [2, 5, 8]) ]
>
> Your split() would be nearly equivalent to this with a key function that
> returns a Boolean.

Well, here is my go at doing the original with iterators:

def splitter(source, test=bool):
a, b = itertools.tee((x, test(x)) for x in source)
return (data for data, decision in a if decision), (
data for data, decision in b if not decision)

This has the advantage that it can operate on infinite lists.  For
something like splitby for grouping, I seem to need to know the cases
up front:

def _make_gen(particular, src):
return (x for x, c in src if c == particular)

def splitby(source, cases, case):
'''Produce a dict of generators for case(el) for el in source'''
decided = itertools.tee(((x, case(x)) for x in source), len(cases))
return dict((c, _make_gen(c, src))
for c, src in zip(cases, decided))

example:

def classify(n):
'''Least prime factor of a few'''
for prime in [2, 3, 5, 7]:
if n % prime == 0:
return prime
return 0

for k,g in splitby(range(50), (2, 3, 5, 7, 0), classify).items():
print('%s: %s' % (k, list(g)))

0: [1, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
2: [0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24,
26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48]
3: [3, 9, 15, 21, 27, 33, 39, 45]
5: [5, 25, 35]
7: [7, 49]

--Scott David Daniels
Scott.Daniels at Acm.Org

```