[Tutor] Joining all strings in stringList into one string
Steven D'Aprano
steve at pearwood.info
Thu Jun 7 01:06:31 CEST 2012
Prasad, Ramit wrote:
>> But more importantly, some years ago (Python 2.4, about 8 years ago?) the
>> Python developers found a really neat trick that they can do to optimize
>> string concatenation so it doesn't need to repeatedly copy characters over
>> and over and over again. I won't go into details, but the thing is, this
>> trick works well enough that repeated concatenation is about as fast as the
>> join method MOST of the time.
>
> I would like to learn a bit more about the trick if you have a
> reference handy. I have no intention of using it, but it sounds
> interesting and might teach me more about Python internals.
In a nutshell, CPython identifies cases like:

    mystr = mystr + otherstr
    mystr += otherstr

where mystr is not used in any other place, and if possible, resizes mystr in
place and appends otherstr, rather than copying both to a new string object.
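
To see the effect for yourself, here is a rough sketch that builds the same string three ways. The function names and sizes are made up for illustration, and the "aliased" version rests on an assumption: that keeping a second reference to the string is enough to stop CPython from resizing it in place. Timings depend on platform, Python version and implementation, so treat the printed numbers as indicative only.

```python
import timeit


def concat_plus(pieces):
    # Repeated concatenation. In CPython, `result` is referenced nowhere
    # else, so the interpreter may be able to grow it in place.
    result = ""
    for piece in pieces:
        result += piece
    return result


def concat_plus_aliased(pieces):
    # Same loop, but `alias` keeps a second reference to the old string.
    # Assumption: this prevents the in-place resize, forcing a full copy
    # on every iteration.
    result = ""
    for piece in pieces:
        alias = result  # extra reference to the current string
        result += piece
    return result


def concat_join(pieces):
    # The recommended approach: does not rely on any interpreter trick.
    return "".join(pieces)


if __name__ == "__main__":
    pieces = ["spam"] * 5000
    for func in (concat_plus, concat_plus_aliased, concat_join):
        elapsed = timeit.timeit(lambda: func(pieces), number=10)
        print("%-20s %.4f seconds" % (func.__name__, elapsed))
```

All three functions return the same string; only the cost of building it differs. Since the optimization is a CPython implementation detail, join remains the safer choice when the list of pieces may be large.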
The "if possible" hides a lot of technical detail, which is why the
optimization can fail on some platforms while working on others. See this
painful discussion trying to debug httplib slowness:
http://mail.python.org/pipermail/python-dev/2009-August/091125.html
After many dead-ends and red herrings, somebody spotted the problem:
http://mail.python.org/pipermail/python-dev/2009-September/091582.html
ending with GvR admitting that it was an embarrassment that repeated string
concatenation had survived in the standard library for so long. The author
even knew it was slow because he put a comment warning about it!
Here is the middle of the discussion about adding the optimization back in 2004:
http://mail.python.org/pipermail/python-dev/2004-August/046695.html
which talks about the possibility of other implementations doing something
similar. You can find the beginning of the discussion yourself :)
And here is a good description of the optimization itself:
http://utcc.utoronto.ca/~cks/space/blog/python/ExaminingStringConcatOpt
--
Steven