Paul Sokolovsky wrote:
> Roughly speaking, the answer would be about the same in idea as answers
> to the following questions:
> [snip]
I would say the difference between this proposal so far and the ones listed is that those emphasized concrete, real-world examples from existing code, either in the stdlib or "out in the wild", showing clear before-and-after benefits of the proposed syntax. That may not seem necessary to the person proposing the feature, and it does take some time to research, but it makes for a drastically stronger argument. The code examples I've seen so far in this proposal have been mostly abstract or simple toy examples. To get a general idea, I'd recommend looking over the examples in the respective PEPs and then trying to do something similar in your own arguments.
> The scholasm of "there's only one way to do it" is getting old for this
> language. Have you already finished explaining everyone why we needed
> assignment expressions, and why Python originally had % as a formatting
> operator, and some people swear to keep not needing anything else?
While I agree that it's sometimes okay to go outside the strict bounds of "only one way to do it", doing so needs adequate justification in the form of a demonstrable benefit in real-world code. So the default should be having just one way, unless we have a very strong reason to consider adding an alternative. That was the case for the features you mentioned above.
> Please let people learn
> computer science inside Python, not learn bag of tricks to then escape
> in awe and make up haikus along the lines of:
>
> A language, originally for kids,
> Now for grown-up noobs.
Considering the current widespread usage of Python in the software development industry and others, characterizing it as a language for "grown-up noobs" seems rather disingenuous (even if partially in jest). We emphasize readability and beginner-friendliness, but Python is very far from beginner-only and I don't think it's even reasonable to say that it's going in that direction. In some ways, it simplifies operations that would otherwise be more complicated, but that's largely the point of a high-level language: abstracting the complex and low-level parts to focus more on the core business logic.
Also, while I can see that blindly relying on "str += part" can sidestep the underlying computer science to some degree, I find that appending the parts to a list and joining the elements is conceptually very similar to using a string buffer/builder, even if the syntax differs significantly from how other languages do it.
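To illustrate the similarity, here's a minimal sketch of the two idioms side by side. The function names are hypothetical and only there for illustration; ``io.StringIO`` stands in for a buffer/builder since it's what the thread has been discussing:

```python
import io

def build_with_join(parts):
    # Collect the pieces in a list, then join them once at the end.
    chunks = []
    for part in parts:
        chunks.append(part)
    return "".join(chunks)

def build_with_stringio(parts):
    # Write the pieces into an in-memory text buffer, then read it back.
    buf = io.StringIO()
    for part in parts:
        buf.write(part)
    return buf.getvalue()
```

In both cases the pieces are accumulated in a mutable container and the final string is only produced once, which is the same basic shape as a builder in other languages.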
Regarding the proposal in general though, I actually like the main idea of having "StringBuffer/StringBuilder"-like behavior, assuming it provides substantial benefits to alternative Python implementations compared to ``"".join()``. As someone who regularly uses other languages with something similar, I find the syntax appealing, but not strong enough on its own to justify a stdlib version (mainly since a wrapper would be trivial to implement).
But I'm against the idea of adding this to the existing StringIO class, largely for the reason cited above: it falls outside the scope of StringIO's intended use case. There's also a significant discoverability factor to consider. Based on the name and its use case in existing versions of Python, I don't think a substantial number of users would even consider using it for the purpose of building strings. As it stands, the only people who could end up benefiting from it would be the alternative implementations and their users, assuming they spend time actively searching for a way to build strings with reduced memory usage. So I would greatly prefer to see it as a separate class with a more informative name, even if it ends up being effectively implemented as a subset of StringIO with much of the same logic.
For example:
    buf = StringBuilder()  # feel free to bikeshed over the name
    for part in parts:
        buf += part  # in the __iadd__, it would presumably call something like buf.append() or buf.write()
    return str(buf)
This would be highly similar to existing string building classes in other popular languages, such as Java and C#.
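For the sake of discussion, here's a rough sketch of how trivial such a wrapper could be in pure Python. To be clear, ``StringBuilder``, ``append()``, and the list-backed storage are all assumptions on my part, not an existing stdlib API or a proposed implementation:

```python
class StringBuilder:
    """Hypothetical string builder; the name and methods are placeholders."""

    def __init__(self):
        # Assumption: back the builder with a list and join lazily.
        self._parts = []

    def append(self, part):
        self._parts.append(part)
        return self

    def __iadd__(self, part):
        # "buf += part" delegates to append() and rebinds buf to itself.
        self.append(part)
        return self

    def __str__(self):
        return "".join(self._parts)


def build(parts):
    buf = StringBuilder()
    for part in parts:
        buf += part
    return str(buf)
```

An alternative implementation could swap the list for whatever internal buffer is most efficient for it, without changing the interface.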
Also, on the point of memory usage: I'd very much like to see some real side-by-side comparisons of the ``''.join(parts)`` memory usage across Python implementations compared to ``StringIO.write()``. I saw some earlier in the thread, but the results were inaccurate since they relied entirely on ``sys.getsizeof()``, as already pointed out. IMO, having accurate memory benchmarks is critical to this proposal. As Chris Angelico mentioned, this can be observed by monitoring the before and after RSS (or the equivalent on platforms without it). On Linux, I typically use something like this:
```
import os

def show_rss():
    os.system(f"grep ^VmRSS /proc/{os.getpid()}/status")  # prints this process's resident set size
```
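If it helps anyone put together a comparison, below is a rough sketch of a before/after RSS measurement for the two approaches. The helper names are hypothetical, it assumes a Linux-style ``/proc`` (returning ``None`` elsewhere), and RSS deltas are noisy, so each approach should really be run in a fresh interpreter process and repeated several times:

```python
import io
import os

def read_rss_kb():
    """Return the current VmRSS in kB, or None if /proc is unavailable."""
    try:
        with open(f"/proc/{os.getpid()}/status") as f:
            for line in f:
                if line.startswith("VmRSS:"):
                    return int(line.split()[1])
    except OSError:
        pass
    return None

def rss_delta_kb(build, parts):
    """Call build(parts) and return (result, RSS change in kB or None)."""
    before = read_rss_kb()
    result = build(parts)  # keep a reference so the string stays alive
    after = read_rss_kb()
    if before is None or after is None:
        return result, None
    return result, after - before

def join_build(parts):
    return "".join(parts)

def stringio_build(parts):
    buf = io.StringIO()
    for part in parts:
        buf.write(part)
    return buf.getvalue()
```

Calling ``rss_delta_kb(join_build, parts)`` and ``rss_delta_kb(stringio_build, parts)`` in separate processes, with the same large ``parts``, would give numbers that can at least be compared across implementations.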
With the above in mind, I'm currently +0 on the proposal. It seems like it might be a reasonable overall idea, but the arguments of its benefits need to be much more concrete before I'm convinced.