[Raymond Hettinger, on lazy string cat]
... With little effort on the C side, there is an opportunity to be the first dynamic language with O(n) behavior for serial string concatenations -- one less thing to teach, one step towards scalability.
Perl has mutable strings, and appending is done in-place. I don't know whether they do all the memory mgmt tricks needed to guarantee worst-case linear behavior, but in practice stuff like this doesn't visibly slow down as time goes on (while it slows down dramatically in Python): $a = 'x'; while (1) { $a .= 'x'; if (length($a) % 1000 == 0) { print length($a), "\n"; } } Icon has immutable strings, but append-a-bunch-of-stuff-to-a-new-string is *often* linear-time there too. That's an implementation trick, relying on a compacting garbage collector so that new allocations are done just by incrementing the high-water mark at the end of the current free area. In practice it often turns out to be the case that catenating strings a & b into a finds a and b to be the "last two" memory regions allocated, and then the total cost of catenation is building a new (fixed-size) string descriptor, spanning the combined regions. So a compacting garbage collector, the internal use of string descriptors, and *not* sticking a NUL byte on the end of strings all conspire to make this possible. Java uses mutable string buffers under the covers, so avoids quadratic-time behavior too. The compiler conspires to make that happen. Python isn't unique in having a problem to solve here, but it's not the first to think of it either.
But wait, I think it won't fly at all unless you make ob_sval a pointer to separately allocated memory. Otherwise, how could _autojoin() possibly "fix" the string without allocating the memory for it?
PyString_Resize().
There isn't such a thing. The function you're thinking of is _PyString_Resize(), and it's in the private API for good reason: it's dangerous, *because* it can need to move the string object. Note that the implementation aborts unless the refcount on the string object is 1. It's intended to be used internally, and with no possibility of Python-level code "seeing" a string object between the time it's created and the time _PyString_Resize() returns.
... That being said, I didn't miss that you hate the idea, so I'll craft a recipe and drop it :-(
Because of the prohibition against relocating visible Python objects, this isn't so easy to worm around in Python. "".join(strings) is just a rite of Python passage.