string substitutions

Gerson Kurz gerson.kurz at t-online.de
Sun Feb 24 00:51:46 EST 2002


On 23 Feb 2002 11:52:10 -0800, bobnotbob at byu.edu (Bob Roberts) wrote:

>What would be a good way to replace every one or more spaces (" ") in
>a string with just one space?  Or replace any number of newlines with
>just one?

Let's see. Four solutions were mentioned:

-------------(cut here)---------------
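import re

def test1(newstring):
    # Repeatedly collapse double blanks until none remain.
    while newstring.find('  ') > -1:
        newstring = newstring.replace('  ', ' ')
    return newstring

def test2(newstring):
    # Split on single blanks and drop the empty strings between them.
    return " ".join(filter(None,newstring.split(' ')))

def test3(newstring):
    # Replace each run of blanks with one blank.
    return re.sub(' +', ' ', newstring)

def test4(newstring):
    # Split on any whitespace (blanks, tabs, newlines) and rejoin.
    return ' '.join(newstring.split())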
-------------(cut here)---------------

Note that test4() also splits on newlines and tabs, which is why
test2() explicitly splits on *blanks only*. Given these, you can test
which one is fastest:

-------------(cut here)---------------
import time

def measure(count,func):
    # Time `count` calls of func on a fixed sample string.
    start = time.clock()
    for i in xrange(count):
        func("This  is a    test\n isn't it?")
    print "%s took %.2f for %d elements" % (func.__name__,
        time.clock() - start, count)

for count in [pow(10,i) for i in range(5,7)]:
    for func in (test1, test2, test3, test4):
        measure(count,func)
    print
-------------(cut here)---------------

This results in (times in seconds):

test1 took 1.82 for 100000 elements
test2 took 2.12 for 100000 elements
test3 took 3.75 for 100000 elements
test4 took 1.38 for 100000 elements

test1 took 19.06 for 1000000 elements
test2 took 22.17 for 1000000 elements
test3 took 38.20 for 1000000 elements
test4 took 12.17 for 1000000 elements

So the simple "while" solution (test1) is actually the fastest if you
really want to replace only spaces (and not newlines and tabs as well).
test4 is faster still, but it collapses all whitespace. I find that
quite interesting, because the other three solutions seem so much more
sophisticated.


