[Tutor] Joining all strings in stringList into one string

Steven D'Aprano steve at pearwood.info
Thu Jun 7 01:06:31 CEST 2012


Prasad, Ramit wrote:
>> But more importantly, some years ago (Python 2.4, about 8 years ago?) the
>> Python developers found a really neat trick that they can do to optimize
>> string concatenation so it doesn't need to repeatedly copy characters
>> over and over and over again. I won't go into details, but the thing is,
>> this trick works well enough that repeated concatenation is about as fast
>> as the join method MOST of the time.
> 
> I would like to learn a bit more about the trick if you have a 
> reference handy. I have no intention of using it, but it sounds 
> interesting and might teach me more about Python internals.

In a nutshell, CPython identifies cases like:

mystr = mystr + otherstr
mystr += otherstr

where nothing else holds a reference to the string that mystr names, and if 
possible, resizes that string in place and appends otherstr, rather than 
copying both into a new string object.
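
The restriction on mystr matters because strings are immutable: if some 
other name still refers to the same string, growing it in place would 
change what that other name sees. Here is a tiny sketch of that situation 
(the names alias and mystr are only for illustration). It shows the 
visible behaviour, not the internals:

```python
mystr = "spam"
alias = mystr           # a second reference to the same string object

mystr += " and eggs"    # alias must keep seeing "spam", so the old string
                        # cannot simply be resized in place here

print(alias)            # the original string is untouched
print(mystr)            # mystr now names the longer string
```

Either way the result is the same. The optimization only affects how much 
copying happens behind the scenes.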

The "if possible" hides a lot of technical detail, which is why the 
optimization can fail on some platforms while working on others. See this 
painful discussion trying to debug httplib slowness:

http://mail.python.org/pipermail/python-dev/2009-August/091125.html

After many dead-ends and red herrings, somebody spotted the problem:

http://mail.python.org/pipermail/python-dev/2009-September/091582.html

ending with GvR admitting that it was an embarrassment that repeated string 
concatenation had survived in the standard library for so long. The author 
even knew it was slow because he put a comment warning about it!
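
The portable idiom that does not depend on the optimization at all is to 
collect the pieces in a list and join them once at the end. A minimal 
sketch, with made-up function names and data (this is not the httplib 
code, just an illustration of the two styles):

```python
def build_with_concat(pieces):
    # Repeated concatenation: fast only when the CPython trick applies,
    # otherwise each += may copy everything accumulated so far.
    result = ""
    for piece in pieces:
        result += piece
    return result


def build_with_join(pieces):
    # join sees all the pieces at once, so it does not depend on the
    # in-place resize trick.
    return "".join(pieces)


pieces = ["chunk%d " % i for i in range(5)]
print(build_with_concat(pieces) == build_with_join(pieces))
```

Both functions return the same string. The difference only shows up in 
running time, and only when the optimization fails to kick in.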

Here is the middle of the discussion about adding the optimization, back in 2004:

http://mail.python.org/pipermail/python-dev/2004-August/046695.html

which talks about the possibility of other implementations doing something 
similar. You can find the beginning of the discussion yourself :)

And here is a good description of the optimization itself:

http://utcc.utoronto.ca/~cks/space/blog/python/ExaminingStringConcatOpt




-- 
Steven

