PATCH: Speed up direct string concatenation by 20+%!

Larry Hastings larry at hastings.org
Tue Oct 3 01:35:06 EDT 2006


Fredrik Lundh wrote:
> You should also benchmark this against code that uses the ordinary
> append/join pattern.

Sorry, thought I had.  Of course, now that the patch is up on
SourceForge you could download it and run all the benchmarks you like.

For all the benchmarks I ran below, the number listed is the best of
three runs.  Time was computed using sum(os.times()[:2]).
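
For anyone who wants to reproduce the numbers, the method above can be
sketched roughly as follows.  This is only an illustration, not the script
that produced the timings in this message: the helper names (cpu_seconds,
best_of, append_join, plus_equals) are made up, the loop count is reduced,
and range() is used in place of xrange() so that it also runs on current
Python versions.

```python
import os


def cpu_seconds():
    # User + system CPU time of this process, i.e. sum(os.times()[:2]).
    return sum(os.times()[:2])


def best_of(func, runs=3):
    # Call func `runs` times and return the fastest CPU time in milliseconds.
    best = None
    for _ in range(runs):
        start = cpu_seconds()
        func()
        elapsed_ms = (cpu_seconds() - start) * 1000.0
        if best is None or elapsed_ms < best:
            best = elapsed_ms
    return best


def append_join(n=100000):
    # The ordinary append/join pattern.
    x = []
    for i in range(n):
        x.append("a")
    return "".join(x)


def plus_equals(n=100000):
    # Direct string concatenation with +=.
    x = ""
    for i in range(n):
        x += "a"
    return x


if __name__ == "__main__":
    print("append/join: %.0fms" % best_of(append_join))
    print("+=:          %.0fms" % best_of(plus_equals))
```

Measuring CPU time rather than wall-clock time keeps other activity on the
machine from inflating the result, and taking the best of several runs
filters out one-off slowdowns.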

Running this under Python 2.5 release:
    x = []
    for i in xrange(10000000):
        x.append("a")
    y = "".join(x)
took 4421ms.

Running this under my patched Python:
    x = ""
    for i in xrange(10000000):
        x += "a"
    y = x[1]
took 4406ms.

I assert that my code makes + as fast as the old "".join([]) idiom.


> It's rather unlikely that something like this will ever be added to
> the 2.X series.  It's pretty unlikely for 3.X as well (GvR used a
> rope-like structure for ABC, and it was no fun for anyone), but it'll
> most likely be a lot easier to provide this as an option for 3.X.

I can't address the ABC implementation as I've never seen it.  But my
patch only touches four files. Two are, obviously, stringobject.c and
stringobject.h.
The other two files, ceval.c and codeobject.c, only got one-line
changes.  My changes to PyStringObject are well-encapsulated; as long
as core / extension programmers continue to use PyString_AS_STRING() to
access the char * in a PyStringObject they will never notice the
difference.
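
To illustrate why a single accessor is enough to hide the change, here is a
rough Python model.  It assumes that the patch defers the actual
concatenation until the string's characters are first needed; this message
does not spell that out, so treat it as an assumption.  The class LazyStr
and its render() method are hypothetical names for this sketch, with
render() standing in for PyString_AS_STRING().  None of this is the actual
C code from the patch.

```python
class LazyStr(object):
    """Toy model of deferred concatenation (hypothetical, not the patch)."""

    def __init__(self, left, right=None):
        self._left = left      # a plain str or another LazyStr
        self._right = right    # a plain str, or None for a leaf
        self._flat = None      # cached flat string once rendered

    def __add__(self, other):
        # Cheap: remember both operands, copy no characters yet.
        if isinstance(other, LazyStr):
            other = other.render()
        return LazyStr(self, other)

    def render(self):
        # Single access point for the characters; builds them on first use.
        if self._flat is None:
            parts = []
            node = self
            # Walk the left spine iteratively to avoid deep recursion.
            while isinstance(node, LazyStr):
                if node._flat is not None:
                    parts.append(node._flat)
                    break
                if node._right is not None:
                    parts.append(node._right)
                node = node._left
            else:
                parts.append(node)   # reached the innermost plain str
            parts.reverse()
            self._flat = "".join(parts)
            self._left = self._right = None   # release the pieces
        return self._flat

    def __getitem__(self, index):
        # Indexing needs real characters, so it forces a render.
        return self.render()[index]


x = LazyStr("")
for i in range(1000):
    x += "a"          # no __iadd__, so this falls back to __add__
y = x[1]              # first access to the characters triggers the join
```

Callers that only ever go through render() (or, in C, the accessor macro)
cannot tell whether the characters were stored flat all along or assembled
on first access.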


John Machin wrote:
> try benchmarking this ... well "style" may not be the appropriate word

Running this under Python 2.5 release:
    x = []
    xappend = x.append
    for i in xrange(10000000):
        xappend("a")
    y = "".join(x)
took 3281ms.

Running this under my patched Python 2.5:
    x = ""
    xappend = x.__add__
    for i in xrange(10000000):
        xappend("a")
    y = "".join(x)
took 3343ms.


/larry/



