
On Wed, 1 Apr 2020 at 02:07, Steven D'Aprano <steve@pearwood.info> wrote:
> Paul has not suggested making StringIO look and feel like a string. Nobody is going to add 45+ string methods to StringIO. This is a minimal extension to the StringIO class which will allow people to improve their string building code with a minimal change.
Thanks for paring the proposal down to its bare bones. There are a lot of side questions being discussed here that are confusing things for me.

With this in mind, and looking at the bare proposal, my immediate thought is: who's going to use this new approach?

    buf = StringIO()
    buf += 'substring'
    buf = buf.getvalue()

I hope this isn't going to trigger another digression, but it seems to me that the answer is "nobody, unless they are taught about it, or work it out for themselves[1]".

My reasons for saying this are that it adds no value over the current idiom of building a list then using join(), so people who already write efficient code won't need to change. The people who *might* change to this are people currently writing

    buf = ''
    # repeated many times
    buf += 'substring'

Those people have presumably not yet learned about the (language independent) performance implication of repeated concatenation of immutable strings[2]. I'm ignoring CPython's optimisation for += on strings here, as all that will do is allow them to survive longer without hitting the issues with this pattern. When they *do* find there's an issue, they will be looking for a better approach.

At the moment, the message is relatively clear - "build a list and join it" (it's very rare that anyone suggests StringIO currently). This proposal is presumably intended to make "use StringIO and +=" a more attractive alternative (because it avoids the need to rewrite all those += lines). So we now find ourselves in the position of having *two* "recommended approaches" to addressing the performance issue with string concatenation.

I'd contend that there's a benefit in having a single well-known idiom for fixing this issue when beginners hit it: clarity of teaching, and less confusion for people who are learning that they need to address an issue that they weren't previously aware of.
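For reference, here is a minimal sketch of the three approaches being compared, written so that it runs on Python as it stands today. The function names are purely illustrative, not anything from the proposal, and the StringIO version uses write(), since StringIO does not currently support the proposed += spelling.

```python
import io


def build_with_concat(parts):
    """Repeated += on an immutable str: the pattern beginners tend to write."""
    buf = ''
    for part in parts:
        buf += part  # may copy the accumulated string on each iteration
    return buf


def build_with_join(parts):
    """Build a list, then join it once: the idiom usually recommended."""
    chunks = []
    for part in parts:
        chunks.append(part)
    return ''.join(chunks)


def build_with_stringio(parts):
    """StringIO as it exists today: write() rather than the proposed +=."""
    buf = io.StringIO()
    for part in parts:
        buf.write(part)
    return buf.getvalue()


parts = ['substring'] * 3
results = [
    build_with_concat(parts),
    build_with_join(parts),
    build_with_stringio(parts),
]
```

All three produce the same string; they differ only in how the intermediate result is accumulated, which is the whole of the teaching question.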
I further suggest that the benefits of the += syntax on StringIO (less change to existing code) are not sufficient to outweigh the benefits of having a single well-known "best practice" solution.

So I'm -0.5 on this change (only 0.5, because it's a pretty trivial change, and not worth getting too worked up about).

Paul

[1] Or they have a vested interest in using the "string builder" pattern in Python, rather than using Python's native idioms. That's not an uncommon situation, but I don't think "helping people write <language X> in Python" is a good criterion for assessing language changes, in general.

[2] Or they have, and know that it doesn't affect them, in which case they don't need to change anything.