[Python-Dev] PATCH submitted: Speed up + for string Re: PATCHsubmitted: Speed up + for string concatenation, now as fast as "".join(x) idiom

Larry Hastings larry at hastings.org
Wed Oct 18 22:04:14 CEST 2006

Chetan Pandya wrote:
> The deallocation code needs to be robust for a complex tree - it is 
> currently not recursive, but needs to be, like the concatenation code.
It is already both those things.

Deallocation is definitely recursive.  See Objects/stringobject.c, 
function (*ahem*) recursive_dealloc.  That Py_DECREF() line is where it 
recurses into child string concatenation objects.

You might have been confused because it is *optimized* for the general 
case, where the tree only recurses down the left-hand side.  For the 
left-hand side it iterates, instead of recursing, which is both slightly 
faster and much more robust (unlikely to blow the stack).

> Rendering occurs if the string being concatenated is already a 
> concatenation object created by an earlier assignment.
Nope.  Rendering only occurs when somebody asks for the string's value, 
not when merely concatenating.  If you add nine strings together, the 
ninth one fails the "left side has room" test and creates a second object.

Try stepping through it.  Run Python interactively under the debugger.  
Let it get to the prompt.  Execute some expression like "print 3", just 
so the interpreter creates its concatenated encoding object (I get 
"encodings.cp437").  Now, in the debugger, put a breakpoint in the 
rendering code in recursiveConcatenate(), and another on the "op = 
(PyStringConcatenationObject *)PyObject_MALLOC()" line in 
string_concat.  Finally, go back to the Python console and concatenate 
nine strings with this code:
  x = ""
  for i in xrange(9):
    x += "a"
You won't hit any breakpoints for rendering, and you'll hit the string 
concatenation object malloc line twice.  (Note that for demonstration 
purposes, this code is more illustrative than running x = "a" + "b" ... 
+ "i" because the peephole optimizer makes a constant folding pass.  
It's mostly harmless, but for my code it does mean I create 
concatenation objects more often.)

In the interests of full disclosure, there is *one* scenario where pure 
string concatenation will cause it to render.  Rendering or deallocating 
a recursive object that's too deep would blow the program stack, so I 
limit recursion depth on the right seven slots of the recursion object.  
That's what the "right recursion depth" field is used for.  If you 
attempt to concatenate a string concatenation object that's already at 
the depth limit, it renders the deep object first.  The depth limit is 
2**14 right now.

You can force this to happen by prepending like crazy:
  x = ""
  for i in xrange(2**15):
    x = "a" + x

Since my code is careful to be only iterative when rendering and 
deallocating down the left-hand side of the tree, there is no depth 
limit for the left-hand side.

Step before you leap,


More information about the Python-Dev mailing list