
On Mon, Mar 30, 2020 at 01:59:42PM -0700, Andrew Barnert via Python-ideas wrote: [...]
> When you call getvalue() it then builds a Py_UCS4* representation that’s in this case 4x the size of the final string (since your string is pure ASCII and will be stored in UCS1, not UCS4). And then there’s the final string.
>
> So, if this memory issue makes join unacceptable, it makes your optimization even more unacceptable.
You seem to be talking about a transient spike in memory usage, as the UCS4 string is built and then disposed of. Paul seems to be talking about holding on to large numbers of substrings for long periods of time, possibly minutes or hours or even days in the case of a long-running process.

If StringIO.getvalue() builds an unnecessary UCS4 string, that's an obvious opportunity for optimization. Regardless of whether people use StringIO by calling the write() method or Paul's proposed `+=`, this optimization might still be useful.

In any case, throw one emoji into your buffer, just one, and the whole point becomes moot. Whether you are using StringIO or list.append plus join, you still end up with a UCS4 string at the end.

I don't understand the CPython implementation very well, and I barely know any C at all, but your argument seems a bit dubious to me. Regardless of the implementation, if you accumulate N code points, it takes a minimum of N times the width of a code point to store that buffer. With a StringIO buffer, there is at least the opportunity to keep them all in a single buffer with minimal overhead:

    buf --> [CCCC]  # four code points, each of 4 bytes in UCS4

With a list, you have significantly more overhead. For the sake of discussion, let's say you build it from four one-character strings:

    lst --> [PPPP]  # four pointers to str objects

Each pointer will take eight bytes on a modern 64-bit system, so that's already double the size of buf. Then there is the object overhead of the four strings, which is *particularly* acute for single ASCII chars: 50 bytes for a one-byte ASCII char. So in the worst case, every char you add to your buffer takes 58 bytes in a list versus 4 for a StringIO that uses UCS4 internally.

Whether StringIO takes advantage of that opportunity *right now* or not is, in a sense, irrelevant. It's an opportunity that lists don't have. Any (potential) inefficiency in StringIO could be improved, but it's baked into the design of lists that they *must* keep each string as a separate object.

Of course there are only 128 unique ASCII characters, and interning reduces some of that overhead. But even in the best case, where you are appending large strings, there's always going to be more memory overhead in a list, overhead that a buffer has the opportunity to avoid.

And if some specific implementation happens to have a particularly inefficient StringIO, that's a matter of quality of implementation, and something for the users of that specific interpreter to take up with its maintainers. It's not a reason for us to reject Paul's proposal.
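If anyone wants to put rough numbers on that, sys.getsizeof gives a back-of-the-envelope view. The figures below are what I'd expect from a 64-bit CPython 3.8; they're approximate, vary by version, and are only meant to illustrate the scale of the overhead:

    import sys

    # Per-object overhead dwarfs tiny strings.
    print(sys.getsizeof(''))    # ~49 bytes of overhead for an empty str
    print(sys.getsizeof('a'))   # ~50 bytes for a single ASCII char

    # One emoji is enough to push the whole string into UCS4 storage.
    print(sys.getsizeof('a' * 1000))                # ~1049: about 1 byte per char (UCS1)
    print(sys.getsizeof('\U0001F600' + 'a' * 999))  # ~4076: about 4 bytes per char (UCS4)

    # A list of one-char strings pays ~8 bytes per pointer for the list
    # alone, before counting the string objects it points to (interning
    # means the repeated 'a' entries at least share one object here).
    chars = list('a' * 1000)
    print(sys.getsizeof(chars))   # ~8056: just the pointer array

The exact numbers don't matter much; the point is the per-object overhead that a contiguous buffer never has to pay.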
> And thinking about portable code makes it even worse. Your code might be run under CPython and take even more memory, or it might be run under a different Python implementation where StringIO is not accelerated (where it’s just a TextIOWrapper around a BytesIO) and therefore be a whole lot slower instead.
So wait, let me see if I understand your argument:

1. CPython's string concatenation is absolutely fine, even though it is demonstrably slower on 11 out of the 12 interpreters that Paul tested.

2. The mere possibility of even a single hypothetical Python interpreter that has a slow and unoptimized StringIO buffer is enough to count against Paul's proposal.

Is that correct, or have I missed some nuance to your defence of string concatenation and rejection of Paul's proposal?
> So it has to be able to deal with both of those possibilities, not just one; code that uses the usual idiom, on the other hand, behaves pretty similarly on all implementations.
The "usual idiom" being discussed here is repeated string concatenation, which certainly does not behave similarly on all implementations. Unless, of course, you're referring to it performing *really poorly* on all implementations except CPython.
> > My whole concern is along 2 lines:
> >
> > 1. This StringBuilder class *could* be an existing io.StringIO.
> > 2. By just adding __iadd__ operator to it.
>
> No, it really couldn’t. The semantics are wrong (unless you want, say, universal newline handling in your string builder?),
Ah, now *that* is a good point.
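For anyone following along at home, this is roughly what Andrew means, assuming I've read the io docs correctly. A StringIO constructed with newline=None quietly translates line endings, which is not something you'd want from a general-purpose string builder:

    import io

    # Default newline='\n': no translation, you get back what you wrote.
    plain = io.StringIO()
    plain.write('spam\r\neggs')
    print(repr(plain.getvalue()))       # 'spam\r\neggs'

    # newline=None: universal-newline translation is applied, so
    # '\r\n' (and bare '\r') silently become '\n'.
    universal = io.StringIO(newline=None)
    universal.write('spam\r\neggs')
    print(repr(universal.getvalue()))   # 'spam\neggs'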
> it’s optimized for a different use case than string building,
It is? That's odd. The whole purpose of StringIO is to build strings. What use-case do you believe it is optimized for?

--
Steven