As others have pointed out, the OP started in a bit of an oblique way, but it may come down to this:

There are some use-cases for a mutable string type. And one could certainly write one.

And presto, here is one:

https://github.com/Daniil-Kost/mutable_strings

It looks to me to be more of a toy than anything, but maybe the author is seriously using it... (it does look like it has an indexing bug when there are non-ASCII characters)

And yet, as far as I know, there has never been one that was carefully written and optimized, which would be a bit of a trick, because of how Python strings handle Unicode. (it would have been a lot easier with Python2 :-) )
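To make the Unicode wrinkle concrete, here is a small sketch. It relies on a CPython implementation detail (3.3 and later, PEP 393), not a language guarantee: a str is stored with 1, 2, or 4 bytes per character, chosen by the widest code point it contains. So assigning a single wide character into a mutable string could force the whole buffer to be re-encoded.

```python
import sys

# CPython-specific: storage width per character depends on the widest
# code point in the string (PEP 393 flexible representation).
ascii_only = "a" * 100
with_bmp = "a" * 99 + "\u20ac"          # euro sign: 2 bytes per character
with_astral = "a" * 99 + "\U0001F600"   # emoji: 4 bytes per character

# All three strings have the same length, but not the same memory footprint.
sizes = [sys.getsizeof(s) for s in (ascii_only, with_bmp, with_astral)]
print([len(s) for s in (ascii_only, with_bmp, with_astral)])
print(sizes)
```

The exact byte counts vary by CPython version, so none are shown here; the point is only that they grow with the widest character present.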

So why not?

1) As pointed out, high-performance strings are key to a lot of coding, so Python's str is very baked in to a LOT of code, and can't be duck-typed. I know that pretty much the only time I ever type check (as opposed to simple duck typing, EAFP) is for str. So if one were to make a mutable string type, you'd have to convert it to a str a lot in order to use most other libraries.
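A quick sketch of the problem, using collections.UserString from the standard library as a stand-in for a str-like type that is not a str subclass. The shout() function is hypothetical, just an example of the kind of type check that shows up in library code.

```python
from collections import UserString

def shout(text):
    # A typical library-style check: only real str instances are accepted.
    if not isinstance(text, str):
        raise TypeError(f"expected str, got {type(text).__name__}")
    return text.upper() + "!"

s = UserString("hello")   # str-like, but not a str subclass
print(s.upper())          # duck typing works on the object itself

try:
    shout(s)
except TypeError as err:
    print("rejected:", err)

print(shout(str(s)))      # the explicit conversion you'd need everywhere

try:
    "".join([s])          # the builtins are just as strict
except TypeError as err:
    print("rejected:", err)
```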

That being said, one could write a mutable string that mirrored the CPython string type as much as possible, and it could be pretty efficient, even for making regular strings out of it.

2) Maybe it's really not that useful. Other than building up a long string from a bunch of small ones (which can be done fine with .join()), I'm not sure I've had much of a use case -- it would buy you a tiny bit of performance for, say, altering strings in ways that don't change their length, but I doubt there are many (if any) applications that would see any meaningful benefit from that.
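For reference, both cases can already be handled with the builtins. A minimal sketch (the sample data is made up for illustration):

```python
# Building a long string from small pieces: collect them, then join once.
pieces = [f"=={i}==" for i in range(5)]
built = "".join(pieces)

# "Mutating" text without changing its length: edit a list of characters,
# then turn it back into a str at the end.
chars = list("hello world")
chars[0] = "J"
chars[6:11] = "W0RLD"      # same-length slice assignment
edited = "".join(chars)

# For bytes data, bytearray is already a mutable buffer.
buf = bytearray(b"hello world")
buf[0:5] = b"HELLO"

print(built)
print(edited)
print(buf)
```

The list-of-characters version costs an extra copy on the way in and on the way out, which is the "tiny bit of performance" a real mutable string would save.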

So I'd say it hasn't been done because (1) it's a lot of work and (2) it would be a bit of a pain to use, and not gain much at all.

A kind-of-related anecdote:

numpy arrays are mutable, but you cannot change their length in place. So, as with strings, if you want to build up an array from a lot of little pieces, the best way is to put all the pieces in a list, and then make an array out of it when you are done.
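That pattern looks roughly like this. It's a sketch: read_values() and the fake file contents are made up for illustration.

```python
import numpy as np

def read_values(lines):
    """Collect parsed values in a list, then build the array once at the end."""
    values = []
    for line in lines:
        line = line.strip()
        if line:                       # skip blank lines
            values.append(float(line))
    return np.array(values, dtype=np.float64)

# Stand-in for a file of unknown length.
fake_file = ["1.5\n", "2.5\n", "\n", "4.0\n"]
arr = read_values(fake_file)
print(arr.shape, arr.dtype)
```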

I had a need to do that fairly often (reading data from files of unknown size) so I actually took the time to write an array that could be extended.

Turns out that:

1) it really wasn't much faster (than using a list) in the usual use-cases anyway :-)
2) it did save memory -- which only mattered for monster arrays, and I'd likely need to do something smarter anyway in those cases.
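For the curious, the general idea of an extendable array can be sketched as below. This is a minimal illustration with made-up names (GrowableArray, append, to_array), not the actual class mentioned above: it over-allocates a buffer and doubles it when full, so appends are amortized constant time.

```python
import numpy as np

class GrowableArray:
    """1-D array that over-allocates so appends are amortized O(1)."""

    def __init__(self, dtype=np.float64, capacity=16):
        self._buffer = np.empty(capacity, dtype=dtype)
        self._size = 0

    def append(self, value):
        if self._size == len(self._buffer):
            # Out of room: allocate a buffer twice as large and copy over.
            bigger = np.empty(2 * len(self._buffer), dtype=self._buffer.dtype)
            bigger[:self._size] = self._buffer
            self._buffer = bigger
        self._buffer[self._size] = value
        self._size += 1

    def __len__(self):
        return self._size

    def to_array(self):
        # Copy only the used part, so the result doesn't pin the spare capacity.
        return self._buffer[:self._size].copy()

ga = GrowableArray(capacity=2)
for i in range(10):
    ga.append(i * 0.5)
result = ga.to_array()
print(len(ga), result.shape)
```

A Python list does the same over-allocation internally, which goes a long way toward explaining why the speed difference is small.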

I even took some time to write a Cython-optimized version, which only helped a little. I offered it up to the numpy community.

But in the end: no one expressed much interest. And I haven't used it myself for anything in a long while.

Moral of the story: not much point in a special class to do something that can already be done almost as well with the builtins.

-CHB

On Mon, Mar 30, 2020 at 2:06 PM Paul Sokolovsky <pmiscml@gmail.com> wrote:
Hello,

On Tue, 31 Mar 2020 07:40:01 +1100
Chris Angelico <rosuav@gmail.com> wrote:

> On Tue, Mar 31, 2020 at 7:04 AM Paul Sokolovsky <pmiscml@gmail.com>
> wrote:
> >     for i in range(50000):
> >         v = u"==%d==" % i
> >         # All individual strings will be kept in the list and
> >         # can't be GCed before the final join.
> >         sz += sys.getsizeof(v)
> >         sb.append(v)
> >     s = "".join(sb)
> >     sz += sys.getsizeof(sb)
> >     sz += sys.getsizeof(s)
> >     print(sz)
> > 
>
> > ... about order of magnitude more memory ... 
>
> I suspect you may be multiply-counting some of your usage here. Rather
> than this, it would be more reliable to use the resident set size (on
> platforms where you can query that).

I may humbly suggest a different process too: get any hardware
board with MicroPython and see how much data you can collect in a
StringIO and in a list of strings. Well, you actually don't need a
dedicated hardware, just get a Linux or Windows version and run it
with a specific heap size using a -X heapsize= switch, e.g. -X
heapsize=100K.

Please don't stop there, we talk multiple implementations, try it on
CPython too. There must be a similar option there (because how
otherwise you can perform any memory-related testing!), I just forgot
which.

The results should be very apparent, and only forgotten option may
obfuscate it.

[]

--
Best regards,
 Paul                          mailto:pmiscml@gmail.com
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-leave@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/ZWKHUVQUMTUIGKXHGXG2AA3F35VUD2Y4/
Code of Conduct: http://python.org/psf/codeofconduct/


--
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython