[Python-Dev] PATCH submitted: Speed up + for string Re: PATCHsubmitted: Speed up + for string concatenation, now as fast as "".join(x) idiom
larry at hastings.org
Wed Oct 18 22:04:14 CEST 2006
Chetan Pandya wrote:
> The deallocation code needs to be robust for a complex tree - it is
> currently not recursive, but needs to be, like the concatenation code.
It is already both those things.
Deallocation is definitely recursive. See Objects/stringobject.c,
function (*ahem*) recursive_dealloc. That Py_DECREF() line is where it
recurses into child string concatenation objects.
You might have been confused because it is *optimized* for the general
case, where the tree only recurses down the left-hand side. For the
left-hand side it iterates, instead of recursing, which is both slightly
faster and much more robust (unlikely to blow the stack).
> Rendering occurs if the string being concatenated is already a
> concatenation object created by an earlier assignment.
Nope. Rendering only occurs when somebody asks for the string's value,
not when merely concatenating. If you add nine strings together, the
ninth one fails the "left side has room" test and creates a second object.
Try stepping through it. Run Python interactively under the debugger.
Let it get to the prompt. Execute some expression like "print 3", just
so the interpreter creates its concatenated encoding object (I get
"encodings.cp437"). Now, in the debugger, put a breakpoint in the
rendering code in recursiveConcatenate(), and another on the "op =
(PyStringConcatenationObject *)PyObject_MALLOC()" line in
string_concat. Finally, go back to the Python console and concatenate
nine strings with this code:
x = ""
for i in xrange(9):
x += "a"
You won't hit any breakpoints for rendering, and you'll hit the string
concatenation object malloc line twice. (Note that for demonstration
purposes, this code is more illustrative than running x = "a" + "b" ...
+ "i" because the peephole optimizer makes a constant folding pass.
It's mostly harmless, but for my code it does mean I create
concatenation objects more often.)
In the interests of full disclosure, there is *one* scenario where pure
string concatenation will cause it to render. Rendering or deallocating
a recursive object that's too deep would blow the program stack, so I
limit recursion depth on the right seven slots of the recursion object.
That's what the "right recursion depth" field is used for. If you
attempt to concatenate a string concatenation object that's already at
the depth limit, it renders the deep object first. The depth limit is
2**14 right now.
You can force this to happen by prepending like crazy:
x = ""
for i in xrange(2**15):
x = "a" + x
Since my code is careful to be only iterative when rendering and
deallocating down the left-hand side of the tree, there is no depth
limit for the left-hand side.
Step before you leap,
More information about the Python-Dev