Splitting a string every 'n'

Mark McEahern marklists at mceahern.com
Tue Jul 9 12:24:26 EDT 2002


> > But the performance of this is hopeless for very long strings!
> > Presumably because there's too much list reallocation?  Can't Python
> > just optimise this by shuffling the start of the list forward?

A generator compares favorably here with a smart while loop (one that
advances an index rather than re-slicing the string).  Generators also
separate the iteration from the processing, so gen_substring can be reused
wherever you need to iterate over the n-length substrings of a string:

#! /usr/bin/env python

from __future__ import generators
from time import clock

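def gen_substring(s, n):
    # Yield successive n-length slices of s; the last one may be shorter.
    i = 0
    end = len(s)
    # Use < rather than <= so that no empty trailing slice is yielded
    # when len(s) is an exact multiple of n.
    while i < end:
        j = i + n
        yield s[i:j]
        i = j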

def do_gen(s, n):
    for sub in gen_substring(s, n):
        sub.upper()

def do_while_simple(s, n):
    # Each pass copies the whole remainder of s, so the total work grows
    # quadratically with len(s).
    while s:
        sub = s[:n]
        s = s[n:]
        sub.upper()

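def do_while_smarter(s, n):
    # Advance an index instead of re-slicing s, so each pass copies only
    # n characters.
    i = 0
    end = len(s)
    # Use < rather than <= to avoid processing an empty trailing slice.
    while i < end:
        j = i + n
        sub = s[i:j]
        i = j
        sub.upper()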

def time_it(f, *args, **kwargs):
    start = clock()
    f(*args, **kwargs)
    end = clock()
    print "%s: %1.3f" % (f.func_name, end - start)

n = 4
size = 100000
s = 'a' * size

time_it(do_gen, s, n)
time_it(do_while_simple, s, n)
time_it(do_while_smarter, s, n)
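
To illustrate the reuse point, here is a minimal sketch of one generator
feeding two unrelated consumers.  It assumes a modern Python 3 interpreter
(the script above targets Python 2.2), and it uses `i < end` as the loop
condition so that no empty trailing chunk is produced.  The sample strings
and the names `chunks` and `grouped` are made up for the example:

```python
def gen_substring(s, n):
    """Yield successive n-length pieces of s; the last may be shorter."""
    i = 0
    end = len(s)
    while i < end:
        j = i + n
        yield s[i:j]
        i = j

# Consumer 1: collect the pieces into a list.
chunks = list(gen_substring("abcdefghij", 4))

# Consumer 2: insert a space after every 4 characters.
grouped = " ".join(gen_substring("0123456789abcdef", 4))

print(chunks)   # ['abcd', 'efgh', 'ij']
print(grouped)  # 0123 4567 89ab cdef
```

Neither consumer needs to know how the slicing is done, so the loop logic
lives in exactly one place.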
