[Python-Dev] Optimized string concatenation

Tue Aug 3 10:54:01 CEST 2004

Hello,

The SF patch http://www.python.org/sf/980695 about making repeated string
concatenations efficient has been reviewed and is acceptable on technical
grounds.  This is about avoiding the quadratic behavior of

    s = ''
    for x in y:
        s += some_string(x)
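
For comparison, the idiom that already scales linearly everywhere, with or
without the patch, is to collect the pieces and join them once.  A minimal
sketch, in which the body of some_string() and the function names are only
illustrative stand-ins for the placeholders above:

```python
def some_string(x):
    # Hypothetical stand-in for the some_string() placeholder above.
    return str(x)


def concat_inplace(y):
    # The idiom under discussion: without the optimization, each '+='
    # may copy the whole of s, giving quadratic total work.
    s = ''
    for x in y:
        s += some_string(x)
    return s


def concat_join(y):
    # Portable alternative: build all the pieces, then join once.
    return ''.join([some_string(x) for x in y])
```

Both functions return the same string; they differ only in how the cost
grows with the length of y on implementations lacking the optimization.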

This leaves open the policy questions:

* first, is that an implementation detail or a published feature?
  The question is important because the difference in performance is enormous
  -- we are not talking about 2x or even 10x faster but roughly Nx faster
  where N is the size of the input data set.

* if it is a published feature, what about Jython?

* The patch would encourage a coding style that produces programs that
  essentially don't scale on Jython -- nor, for that matter, on 2.3 or older --
  and worse, the programs would *appear* to work on Jython or 2.3 when tested
  with small or medium-sized data sets, but just appear to hang when run on
  larger data sets.  Obviously, this is a problem that has always been there,
  but if we fix it in 2.4 we can be sure that people will develop and test
  with 2.4, and less thoroughly with 2.3, and when they deploy on 2.3
  platforms their programs will unexpectedly not scale.

* Also discussed on SF is whether we should remove the 'a=a+b' acceleration
  from the patch, keeping only 'a+=b'; see the SF tracker.
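
To make the two spellings in the last point concrete, here is a small sketch
(the function names are illustrative and not taken from the patch).  Both
loops build the same string; the open question is only which of the two
forms should be accelerated:

```python
def build_augmented(pieces):
    a = ''
    for b in pieces:
        a += b       # the 'a+=b' form
    return a


def build_binary(pieces):
    a = ''
    for b in pieces:
        a = a + b    # the 'a=a+b' form
    return a
```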

This seems like overkill, but should the acceleration be included but
disabled by default?

from __future__ import string_concatenate?

See you soon,

Armin.