Splitting a sequence into pieces with identical elements

Chris Rebert clp2 at rebertia.com
Wed Aug 11 03:11:00 CEST 2010

On Tue, Aug 10, 2010 at 5:37 PM, candide <candide at free.invalid> wrote:
> Suppose you have a sequence s, a string say, for instance this one:
> spppammmmegggssss
> We want to split s into the following parts :
> ['s', 'ppp', 'a', 'mmmm', 'e', 'ggg', 'ssss']
> i.e. each part is a word made of a single repeated character.
> What is the pythonic way to answer this question?

If you're doing an operation on an iterable, always leaf thru itertools first:

from itertools import groupby
def split_into_runs(seq):
    return ["".join(run) for letter, run in groupby(seq)]
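
A quick sanity check, as a sketch rather than anything definitive: groupby() yields (key, group) pairs for each stretch of *consecutive* equal items, and each group is an iterator over that stretch, so joining it rebuilds the run. The "".join() assumes the items are strings; split_into_runs_generic is a hypothetical name of mine for a variant that keeps each run as a list and so works on any sequence.

```python
from itertools import groupby

def split_into_runs(seq):
    # The key is unused here; each `run` iterates over one stretch of equal items.
    return ["".join(run) for _, run in groupby(seq)]

def split_into_runs_generic(seq):
    # Hypothetical variant: keep runs as lists so non-string items work too.
    return [list(run) for _, run in groupby(seq)]

print(split_into_runs("spppammmmegggssss"))
print(split_into_runs_generic([1, 1, 2, 3, 3]))
```

Note that groupby() only merges adjacent items; it does not sort or collect equal items from different places in the input.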

If itertools didn't exist:

def split_into_runs(seq):
    if not seq: return []

    iterator = iter(seq)
    letter = next(iterator)
    count = 1
    words = []
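    for c in iterator:
        if c == letter:
            count += 1
        else:
            # The run ended: record it and start counting the new letter.
            word = letter * count
            words.append(word)
            letter = c
            count = 1
    # Don't forget the final run, which the loop never flushes.
    words.append(letter * count)
    return words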
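
The hand-rolled version builds each word with letter * count, which assumes single-character strings, and its emptiness check assumes a sequence rather than a bare iterator. Here is a minimal sketch of the same idea for arbitrary iterables, assuming you are happy to get each run back as a list; iter_runs is a hypothetical name, not anything from the standard library.

```python
def iter_runs(iterable):
    iterator = iter(iterable)
    try:
        current = [next(iterator)]
    except StopIteration:
        return  # empty input: there are no runs to yield
    for item in iterator:
        if item == current[-1]:
            current.append(item)   # still inside the same run
        else:
            yield current          # run ended, hand it out
            current = [item]
    yield current                  # flush the final run

print(["".join(run) for run in iter_runs("spppammmmegggssss")])
```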

