fastest string building?

Preston Landers prestonlanders at my-deja.com
Tue Oct 19 14:48:51 EDT 1999


I was recently wondering what the fastest way to build strings is.
For instance, say I have a string and I want to append some things to
the end, or possibly insert them in the middle.  If I were to repeat
that thousands or hundreds of thousands of times, what would be the
most efficient way to write it?
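
To make the two operations in question concrete, here is a minimal sketch of appending and of inserting in the middle. Python strings are immutable, so each step builds a new string. The helper name insert_at is hypothetical, chosen for illustration only.

```python
def insert_at(s, index, piece):
    # Slice around the insertion point and rebuild the string;
    # the original string object is left unchanged.
    return s[:index] + piece + s[index:]

s = "helloworld"
s = insert_at(s, 5, ", ")   # insert in the middle
s = s + "!"                 # append to the end
print(s)
```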

To my knowledge, there are basically three ways to do it.

1: s = part1 + part2 + part3 + ... + partn

2: s = string.join([part1, part2, ..., partn], "")

3: s = "%s%s%s%s" % (part1, part2, part3, part4)
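
For readers on later Python versions, the three approaches can be sketched as below. This assumes a Python where strings have methods, so string.join(parts, "") is written "".join(parts). The function names and sample data are illustrative and do not come from the post.

```python
parts = ["alpha", "beta", "gamma", "delta"]

def concat_plus(parts):
    # Method 1: repeated + concatenation.
    result = ""
    for p in parts:
        result = result + p
    return result

def concat_join(parts):
    # Method 2: join a list of pieces with an empty separator.
    return "".join(parts)

def concat_format(parts):
    # Method 3: one %s per piece, filled in from a tuple.
    return ("%s" * len(parts)) % tuple(parts)

print(concat_plus(parts) == concat_join(parts) == concat_format(parts))
```

All three produce the same string; the open question is only which is cheapest.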

Am I missing anything?

My guess was that method #2 would be fastest, followed by #3, with #1
(by far) the slowest.

I did a quick test (using the unix shell 'time' command) which more or
less confirmed my hypothesis.

Just now, however, I decided to do a more in-depth test with the Python
profiler.  The results were surprising: method #1 above was fastest,
followed closely by #3, and #2 lagged way back.  So, according to this
test, s = part1 + part2 + ... + partn seems to be the fastest way to
build a string out of parts.
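
A deterministic profiler adds overhead to every function call it records, which can matter when the operations being compared are this cheap. One way to cross-check the numbers is plain wall-clock timing. The sketch below assumes a modern Python with the standard timeit module (which did not exist in 1999); the helper names time_method and compare are hypothetical.

```python
import timeit

def method_1(a, b, c, d, e):
    return a + b + c + d + e

def method_2(a, b, c, d, e):
    return "".join([a, b, c, d, e])

def method_3(a, b, c, d, e):
    return "%s%s%s%s%s" % (a, b, c, d, e)

def time_method(method, calls, repeats=3):
    # Time `calls` invocations, several times over, and keep the best run
    # to reduce noise from other activity on the machine.
    args = ("a", "b", "c", "d", "e")
    timer = timeit.Timer(lambda: method(*args))
    return min(timer.repeat(repeat=repeats, number=calls))

def compare(calls=10000):
    # Returns a dict mapping method name to best elapsed time in seconds.
    results = {}
    for method in (method_1, method_2, method_3):
        results[method.__name__] = time_method(method, calls)
    return results

for name, seconds in sorted(compare().items()):
    print("%s: %.6f s" % (name, seconds))
```

Passing calls=125000 would match the 50 * 50 * 50 calls made by the wrapper in the script below. The ranking may well differ between Python versions and machines, so the output should be read as a measurement of one setup, not a general answer.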

Can someone look at my profiling script and see if I made any obvious
errors?  Is this behavior of strings already well known?

By the way, the new_profile module below is just the same as the
regular Python profile with a new profiling constant calibrated for my
machine.
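
The new_profile module itself is not shown. As a rough modern analogue only, current versions of the standard profile module let the per-event overhead constant be measured with Profile.calibrate() and passed back in as a bias, with no need to edit a copy of the module. The sketch below assumes that modern API, not the 1999 one.

```python
import profile

# Measure the pure-Python profiler's per-event overhead on this machine.
# A small count keeps this quick; a larger one gives a steadier estimate.
pr = profile.Profile()
bias = pr.calibrate(1000)
print("estimated bias: %g seconds per event" % bias)

# Build a profiler instance that subtracts the measured overhead.
calibrated = profile.Profile(bias=bias)
```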

thanks,

---Preston



print "This program demonstrates the relative efficiency of different types of string construction methods in Python."

print "Method 1:   s = a + b"
print "Method 2:   s = string.join([a, b], '')"
print "Method 3:   s = '%s%s' % (a, b)"

import string, time

def method_1(a, b, c, d, e):
    return a + b + c + d + e

def method_2(a, b, c, d, e):
    return string.join([a, b, c, d, e], "")

def method_3(a, b, c, d, e):
    return "%s%s%s%s%s" % (a, b, c, d, e)

def wrapper(method):
##    a = str(time.time())
##    b = "constant"
    c_range = 50
    d_range = 50
    e_range = 50

    for c in range(c_range):
        for d in range(d_range):
            for e in range(e_range):
                # args = (a, b, str(c), str(d), str(e))
                args = ("a", "b", "c", "d", "e")
                apply(method, args)

import new_profile
import pstats

def __main__():
    print "Profiling method_1:"
    new_profile.run("wrapper(method_1)", "/tmp/m1.prof")
    p = pstats.Stats("/tmp/m1.prof")
    p.strip_dirs().sort_stats('time','cum', 'nfl').print_stats().print_callees()

    print "Profiling method_2:"
    new_profile.run("wrapper(method_2)", "/tmp/m2.prof")
    p = pstats.Stats("/tmp/m2.prof")
    p.strip_dirs().sort_stats('time','cum', 'nfl').print_stats().print_callees()

    print "Profiling method_3:"
    new_profile.run("wrapper(method_3)", "/tmp/m3.prof")
    p = pstats.Stats("/tmp/m3.prof")
    p.strip_dirs().sort_stats('time','cum', 'nfl').print_stats().print_callees()

    print "Finished!"

__main__()

--
|| Preston Landers <prestonlanders at my-deja.com> ||

