[Python-checkins] cpython (merge 3.2 -> default): Clarify concatenation behaviour of immutable strings, and remove explicit

antoine.pitrou python-checkins at python.org
Fri Nov 25 16:39:26 CET 2011


http://hg.python.org/cpython/rev/905b6f1eca74
changeset:   73736:905b6f1eca74
parent:      73734:db8a14e02342
parent:      73735:2d6f0e2fe034
user:        Antoine Pitrou <solipsis at pitrou.net>
date:        Fri Nov 25 16:34:23 2011 +0100
summary:
  Clarify concatenation behaviour of immutable strings, and remove explicit
mention of the CPython optimization hack.

files:
  Doc/faq/programming.rst  |  26 ++++++++++++++++++++++++++
  Doc/library/stdtypes.rst |  21 ++++++++++++---------
  2 files changed, 38 insertions(+), 9 deletions(-)


diff --git a/Doc/faq/programming.rst b/Doc/faq/programming.rst
--- a/Doc/faq/programming.rst
+++ b/Doc/faq/programming.rst
@@ -989,6 +989,32 @@
 See the :ref:`unicode-howto`.
 
 
+What is the most efficient way to concatenate many strings together?
+--------------------------------------------------------------------
+
+:class:`str` and :class:`bytes` objects are immutable, therefore concatenating
+many strings together is inefficient as each concatenation creates a new
+object.  In the general case, the total runtime cost is quadratic in the
+total string length.
+
+To accumulate many :class:`str` objects, the recommended idiom is to place
+them into a list and call :meth:`str.join` at the end::
+
+   chunks = []
+   for s in my_strings:
+       chunks.append(s)
+   result = ''.join(chunks)
+
+(another reasonably efficient idiom is to use :class:`io.StringIO`)
+
+To accumulate many :class:`bytes` objects, the recommended idiom is to extend
+a :class:`bytearray` object using in-place concatenation (the ``+=`` operator)::
+
+   result = bytearray()
+   for b in my_bytes_objects:
+       result += b
+
+
 Sequences (Tuples/Lists)
 ========================
 
diff --git a/Doc/library/stdtypes.rst b/Doc/library/stdtypes.rst
--- a/Doc/library/stdtypes.rst
+++ b/Doc/library/stdtypes.rst
@@ -968,15 +968,18 @@
    If *k* is ``None``, it is treated like ``1``.
 
 (6)
-   .. impl-detail::
-
-      If *s* and *t* are both strings, some Python implementations such as
-      CPython can usually perform an in-place optimization for assignments of
-      the form ``s = s + t`` or ``s += t``.  When applicable, this optimization
-      makes quadratic run-time much less likely.  This optimization is both
-      version and implementation dependent.  For performance sensitive code, it
-      is preferable to use the :meth:`str.join` method which assures consistent
-      linear concatenation performance across versions and implementations.
+   Concatenating immutable strings always results in a new object.  This means
+   that building up a string by repeated concatenation will have a quadratic
+   runtime cost in the total string length.  To get a linear runtime cost,
+   you must switch to one of the alternatives below:
+
+   * if concatenating :class:`str` objects, you can build a list and use
+     :meth:`str.join` at the end;
+
+   * if concatenating :class:`bytes` objects, you can similarly use
+     :meth:`bytes.join`, or you can do in-place concatenation with a
+     :class:`bytearray` object.  :class:`bytearray` objects are mutable and
+     have an efficient overallocation mechanism.
 
 
 .. _string-methods:
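
For readers who want to try the idioms described in the two hunks above, here is a minimal runnable sketch. It is not part of the changeset. The function names (``join_str_chunks``, ``write_to_stringio``, ``extend_bytearray``, ``join_bytes_chunks``) are hypothetical and chosen only for illustration; the calls they wrap (``str.join``, ``io.StringIO``, ``bytearray`` with ``+=``, ``bytes.join``) are the ones the patch names.

```python
import io


def join_str_chunks(parts):
    """Collect str objects in a list and join them once at the end."""
    chunks = []
    for s in parts:
        chunks.append(s)
    return ''.join(chunks)


def write_to_stringio(parts):
    """Alternative for str: write each piece into an in-memory text buffer."""
    buf = io.StringIO()
    for s in parts:
        buf.write(s)
    return buf.getvalue()


def extend_bytearray(parts):
    """Accumulate bytes objects by in-place concatenation on a bytearray."""
    result = bytearray()
    for b in parts:
        result += b  # mutates result instead of building a new object
    return bytes(result)  # convert back to immutable bytes for the caller


def join_bytes_chunks(parts):
    """Alternative for bytes: a single bytes.join call over all pieces."""
    return b''.join(parts)
```

All four helpers return the same value that repeated ``+`` concatenation would produce; the difference lies only in how many intermediate objects are created along the way.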

-- 
Repository URL: http://hg.python.org/cpython

