Paul Sokolovsky wrote:
> Roughly speaking, the answer would be about the same in idea as answers
> to the following questions:
> [snip]

I would say the difference between this proposal so far and the ones listed is that the latter emphasized concrete, real-world examples from existing code, either in the stdlib or "out in the wild", showing clear before-and-after benefits of the proposed syntax. That may not seem necessary to the person proposing the feature, and it does take some time to research, but it makes a drastically stronger argument for the new feature. The code examples I've seen so far in this proposal have been mostly abstract or simple toy examples. To get a general idea, I'd recommend looking over the examples in their respective PEPs, and then trying to do something similar in your own arguments.

> The scholasm of "there's only one way to do it" is getting old for this
> language. Have you already finished explaining everyone why we needed
> assignment expressions, and why Python originally had % as a formatting
> operator, and some people swear to keep not needing anything else?

While I agree that it's sometimes okay to go outside the strict bounds of "only one way to do it", there needs to be adequate justification for doing so which provides a demonstrable benefit in real-world code. So the default should be just having one way, unless we have a very strong reason to consider adding an alternative. This was the case for the features you mentioned above.

> Please let people learn
> computer science inside Python, not learn bag of tricks to then escape
> in awe and make up haikus along the lines of:
>
> A language, originally for kids,
> Now for grown-up noobs.

Considering the current widespread usage of Python in the software development industry and others, characterizing it as a language for "grown-up noobs" seems rather disingenuous (even if partially in jest). We emphasize readability and beginner-friendliness, but Python is very far from beginner-only and I don't think it's even reasonable to say that it's going in that direction. In some ways, it simplifies operations that would otherwise be more complicated, but that's largely the point of a high-level language: abstracting the complex and low-level parts to focus more on the core business logic.

Also, while I can see that blindly relying on "str += part" can sidestep the underlying computer science to some degree, I find that appending the parts to a list and joining the elements is conceptually very similar to using a string buffer/builder, even if the syntax differs significantly from how other languages do it.
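
To make the comparison concrete, here is a minimal sketch of the two idioms side by side. The function names are made up for illustration. The list in the second function plays the role that a string buffer/builder plays in other languages.

```python
def concat_naive(parts):
    """Repeated concatenation: each += may copy everything built so far."""
    result = ""
    for part in parts:
        result += part
    return result


def concat_join(parts):
    """Collect the pieces in a list, then join them in one final step."""
    chunks = []
    for part in parts:
        chunks.append(part)
    return "".join(chunks)
```

Both return the same string; they differ only in how the intermediate pieces are held.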

Regarding the proposal in general though, I actually like the main idea of having "StringBuffer/StringBuilder"-like behavior, assuming it provides substantial benefits to alternative Python implementations compared to ``"".join()``. As someone who regularly uses other languages with something similar, I find the syntax to be appealing, but not strong enough on its own to justify a stdlib version (mainly since a wrapper would be very trivial to implement).

But, I'm against the idea of adding this to the existing StringIO class, largely for the reasons cited above, of it being outside of the scope of its intended use case. There's also a significant discoverability factor to consider. Based on the name and its use case in existing versions of Python, I don't think a substantial number of users will even consider using it for the purpose of building strings. As it stands, the only people who could end up benefiting from it would be the alternative implementations and their users, assuming they spend time actively searching for a way to build strings with reduced memory usage. So I would greatly prefer to see it as a separate class with a more informative name, even if it ends up being effectively implemented as a subset of StringIO with much of the same logic.

For example:

def build(parts):
    buf = StringBuilder()  # feel free to bikeshed over the name
    for part in parts:
        # __iadd__ would presumably call something like buf.append() or buf.write()
        buf += part
    return str(buf)

This would be highly similar to existing string building classes in other popular languages, such as Java and C#.
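
As a rough illustration of how thin such a wrapper could be, here is a minimal sketch built on top of ``io.StringIO``. The class name ``StringBuilder`` and its ``append()`` method are hypothetical, not an existing stdlib API; only ``io.StringIO.write()`` and ``getvalue()`` are real calls.

```python
import io


class StringBuilder:
    """Hypothetical append-only string builder backed by io.StringIO."""

    def __init__(self, initial=""):
        self._buf = io.StringIO()
        # Write the initial text explicitly so the position ends up after it.
        self._buf.write(initial)

    def append(self, part):
        self._buf.write(part)
        return self  # returning self allows chained append() calls

    def __iadd__(self, part):
        # "buf += part" delegates to append() and rebinds buf to the same object.
        return self.append(part)

    def __str__(self):
        return self._buf.getvalue()


def build(parts):
    buf = StringBuilder()
    for part in parts:
        buf += part
    return str(buf)
```

Whether this saves memory compared to the list-and-join idiom depends on the implementation underneath, which is exactly what would need measuring.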

Also, on the point of memory usage: I'd very much like to see some real side-by-side comparisons of the ``''.join(parts)`` memory usage across Python implementations compared to ``StringIO.write()``. I saw some earlier in the thread, but the results were inaccurate since they relied entirely on ``sys.getsizeof()``, as others have already pointed out. IMO, having accurate memory benchmarks is critical to this proposal. As Chris Angelico mentioned, this can be observed through monitoring the before and after RSS (or equivalent on platforms without it). On Linux, I typically use something like this:

```
import os

def show_rss():  # Linux only: reads /proc
    os.system(f"grep ^VmRSS /proc/{os.getpid()}/status")
```
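
A side-by-side comparison along those lines might look roughly like the sketch below. All function names here are made up for illustration, and the numbers it produces are only indicative: RSS is read after the build, so it includes the result string and misses any transient peak, and the allocator may not return freed memory to the OS. For a fair comparison, each builder should be run in a fresh process. On platforms without ``/proc``, ``read_rss_kb()`` returns ``None``.

```python
import io


def read_rss_kb():
    """Return this process's resident set size in kB, or None if unavailable."""
    try:
        with open("/proc/self/status") as f:
            for line in f:
                if line.startswith("VmRSS:"):
                    return int(line.split()[1])
    except OSError:
        pass
    return None


def build_with_join(parts):
    chunks = []
    for part in parts:
        chunks.append(part)
    return "".join(chunks)


def build_with_stringio(parts):
    buf = io.StringIO()
    for part in parts:
        buf.write(part)
    return buf.getvalue()


def rss_delta_kb(builder, n=200_000):
    """Build a string from n small parts; return (length, RSS change in kB)."""
    before = read_rss_kb()
    result = builder(f"line {i}\n" for i in range(n))
    after = read_rss_kb()
    delta = None if before is None or after is None else after - before
    return len(result), delta


if __name__ == "__main__":
    for builder in (build_with_join, build_with_stringio):
        print(builder.__name__, rss_delta_kb(builder))
```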

With the above in mind, I'm currently +0 on the proposal. It seems like it might be a reasonable overall idea, but the arguments of its benefits need to be much more concrete before I'm convinced.

On Wed, Apr 1, 2020 at 5:45 PM Paul Sokolovsky <pmiscml@gmail.com> wrote:
Hello,

On Wed, 1 Apr 2020 10:01:06 +0100
Paul Moore <p.f.moore@gmail.com> wrote:

> On Wed, 1 Apr 2020 at 02:07, Steven D'Aprano <steve@pearwood.info>
> wrote:
> > Paul has not suggested making StringIO look and feel like a string.
> > Nobody is going to add 45+ string methods to StringIO. This is a
> > minimal extension to the StringIO class which will allow people to
> > improve their string building code with a minimal change. 
>
> Thanks for paring the proposal down to its bare bones, there's a lot
> of side questions being discussed here that are confusing things for
> me.
>
> With this in mind, and looking at the bare proposal, my immediate
> thought is who's going to use this new approach:
>
[]

>
> I hope this isn't going to trigger another digression, but it seems to
> me that the answer is "nobody, unless they are taught about it, or
> work it out for themselves[1]".

Roughly speaking, the answer would be about the same in idea as answers
to the following questions:

* Who'd be using assignment expressions? (2nd way to do assignment,
  whoa!)
* Who'd be using f-strings? (3rd (or more) way to do string formatting,
  bhoa!)
* Who'd be writing s = s.removeprefix("foo") instead of
  "if s.startswith("foo"): s = s[3:]" (PEP616)?
* Who'd be using binary operator @ ?
* Who'd be using unary operator + ?


> My reasons for saying this are that it
> adds no value over the current idiom of building a list then using
> join(), so people who already write efficient code won't need to
> change. The people who *might* change to this are people currently
> writing
>
>     buf = ''
>     # repeated many times
>     buf += 'substring'
>
> Those people have presumably not yet learned about the (language
> independent) performance implication of repeated concatenation of
> immutable strings[2].

Ok, so we found the answers to all those questions - people who might
have a need to use, would use it. You definitely may argue of how many
people (in absolute and relative figures) would use it. Let the binary
operator @ and unary operator + be your aides in this task.


> At the moment, the
> message is relatively clear - "build a list and join it" (it's very
> rare that anyone suggests StringIO currently).

I don't know how much you mix with other Pythonistas, but word "clear"
is an exaggeration. From those who don't like it, the usual word is
"ugly", though I've seen more vivid epithets, like "repulsive":
https://mail.python.org/pipermail/python-list/2006-January/403480.html

More cool-headed guys like me just call it "complete wastage of memory".

> This proposal is
> presumably intended to make "use StringIO and +=" a more attractive
> alternative proposal (because it avoids the need to
> rewrite all those += lines).

Aye.

> So we now find ourselves in the position
> of having *two* "recommended approaches" to addressing the performance
> issue with string concatenation.

The scholasm of "there's only one way to do it" is getting old for this
language. Have you already finished explaining everyone why we needed
assignment expressions, and why Python originally had % as a formatting
operator, and some people swear to keep not needing anything else?

What's worse, is that "there's only one way to do it" gets routinely
misinterpreted as "One True Way (tm)". And where Python is deficient to
other languages, there's rising small-scale exceptionalism along the
lines "we don't have it, and - we don't need it!". The issue is that
some (many) Python programmers use a lot of different languages, and
treat Python first of all as a generic programming language, not as a
bag of tricks of a particular implementation. And of course, there
never will be agreement between the one-true-way-tm vs
nice-generic-languages factions of the community.

> I'd contend that there's a benefit in having a single well-known idiom
> for fixing this issue when beginners hit it. Clarity of teaching, and
> less confusion for people who are learning that they need to address
> an issue that they weren't previously aware of.

Another acute and beaten topic in the community. Python is a melting pot
for diverse masses - beginners, greybeards, data scientists, scripting
kiddies, PhD, web programmers, etc. That's one of the greatest
achievements of Python, but also one of the pain points. I wonder how
many people escaped from Python to just not be haunted by that
"beginners" chanting.

Python is beginners-friendly language, period, can't change that.
Please don't bend it to be beginner-only. Please let people learn
computer science inside Python, not learn bag of tricks to then escape
in awe and make up haikus along the lines of:

A language, originally for kids,
Now for grown-up noobs.

(Actual haiku seen on Reddit, sorry, can't find a link now, reproduced
from memory, the original might have sounded better).

[]

--
Best regards,
 Paul                          mailto:pmiscml@gmail.com
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/P6S5K6X4ZBEBLRVPNOUNFUYW6WNSQUNS/