[Python-Dev] RFC: Add a new builtin strarray type to Python?

Alex Gaynor alex.gaynor at gmail.com
Sun Oct 2 18:34:03 CEST 2011


There are a number of issues that are being conflated by this thread.

1) Should str += str be fast? In my opinion, the answer is an obvious and
   resounding no. Strings are immutable, thus repeated string addition is
   O(n**2). This is a natural and obvious conclusion. Attempts to change this
   are only truly possible on CPython, and thus create a worse environment for
   other Pythons, as well as being quite misleading, as they'll be extremely
   brittle. It's worth noting that, to my knowledge, JVMs haven't attempted
   hacks like this.

2) Should we have a mutable string? Personally I think this question just
   misses the point. No one actually wants a mutable string; the closest thing
   anyone asks for is faster string building, which can be solved by a far more
   specialized thing (see (3)) without all the API hangups of "What methods
   mutate?", "Should it have every str method?", or "Is it a drop-in
   replacement?".

3) And finally, the question that prompted this entire thing: can we have a
   better way of incremental string building than the current list + str.join
   method? Personally I think unless your interest is purely in getting the
   most possible speed out of Python, the current idiom is probably acceptable.
   That said, if you want to get the most possible speed, a StringBuilder in
   the vein of the one PyPy offers is the only sane way. It's able to be faster
   because it has very few ways to interact with it, and once you're done it
   reuses its buffer to create the Python-level string object, which is to say
   there's no need to copy it at the end.
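
To make the "very few ways to interact with it" point in (3) concrete, here is
a rough sketch of how small such an interface can be. The class and its method
names are hypothetical stand-ins written for illustration; this is not PyPy's
implementation, and being pure Python it still copies in join(), so it shows
only the shape of the API, not the buffer-reuse speedup.

```python
class StringBuilder:
    """Hypothetical append-only builder with a deliberately tiny API."""

    def __init__(self):
        self._parts = []      # pending pieces, in order
        self._built = False   # set once build() has been called

    def append(self, s):
        # The only way to add data; no indexing, slicing, or in-place edits.
        if self._built:
            raise ValueError("builder already consumed by build()")
        self._parts.append(s)

    def build(self):
        # Produce the final str and mark the builder as consumed. A native
        # builder could hand its buffer over here instead of copying.
        self._built = True
        result = "".join(self._parts)
        self._parts = None
        return result


builder = StringBuilder()
for word in ("spam", " ", "eggs"):
    builder.append(word)
text = builder.build()
```

Because nothing can observe the intermediate contents, an implementation is
free to keep them in whatever form is cheapest until build() is called.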

As I said, unless your interest is maximum performance, there's nothing wrong
with the current idiom, and we'd do well to educate our users, rather than have
more hacks.
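
For anyone who hasn't seen the current idiom spelled out, a minimal sketch
follows, next to the repeated-addition version it replaces. The function names
are made up for this example.

```python
def build_with_join(chunks):
    # Current idiom: collect the pieces in a list, join once at the end.
    parts = []
    for chunk in chunks:
        parts.append(chunk)
    return "".join(parts)


def build_with_concat(chunks):
    # Repeated addition: since str is immutable, each += may have to copy
    # everything built so far, which is where the O(n**2) comes from.
    result = ""
    for chunk in chunks:
        result += chunk
    return result


chunks = [str(i) for i in range(5)]
joined = build_with_join(chunks)
concatenated = build_with_concat(chunks)
```

Both functions return the same string; only the amount of copying differs.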

Alex


