On Wed, Feb 13, 2013 at 10:02 AM, Victor Stinner email@example.com wrote:
I added a _PyUnicodeWriter internal API to optimize str%args and str.format(args). It uses an overallocated buffer, so it's basically like CPython's str += str optimization. I still don't know how efficient it is on Windows, since realloc() is slow on Windows (at least on old Windows versions).
We should add an official and public API to concatenate strings. I know that PyPy already has its own API. Example:
    writer = UnicodeWriter()
    for item in data:
        writer += item  # i guess that it's faster than writer.append(item)
    return str(writer)  # or writer.getvalue() ?
I don't care about the exact implementation of UnicodeWriter, it just has to be as fast as or faster than ''.join(data).
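To make the proposed API shape concrete, here is a minimal pure-Python sketch that supports the loop above. The class UnicodeWriter and the helper concat are hypothetical names used only for illustration. This version just collects the pieces and calls ''.join once, so it cannot be faster than ''.join(data) and says nothing about how _PyUnicodeWriter works internally.

```python
class UnicodeWriter:
    """Hypothetical writer: collects pieces and joins them once on demand."""

    def __init__(self):
        self._parts = []

    def append(self, text):
        self._parts.append(text)

    def __iadd__(self, text):
        # Supports "writer += item"; must return self so the name stays bound
        # to the writer object.
        self._parts.append(text)
        return self

    def getvalue(self):
        return "".join(self._parts)

    def __str__(self):
        return self.getvalue()


def concat(data):
    """Concatenate an iterable of str using the hypothetical writer."""
    writer = UnicodeWriter()
    for item in data:
        writer += item
    return str(writer)
```

A real implementation would presumably live in C and write into an overallocated buffer instead of keeping a list of pieces.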
I don't remember if _PyUnicodeWriter is faster than StringIO or slower. I created an issue for that: http://bugs.python.org/issue15612
It's in __pypy__.builders (StringBuilder and UnicodeBuilder). The API does not really matter, as long as there is a way to preallocate a certain size (which I don't think there is in StringIO, for example). bytearray comes close but has a relatively inconvenient API, and any pure-Python bytearray wrapper will not be fast on CPython.
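As a rough illustration of the bytearray point, here is a sketch of a pure-Python wrapper that preallocates a buffer. BytesBuilder and size_hint are hypothetical names, and the growth policy is an assumption. As far as I know, bytearray has no way to reserve capacity without also changing its length, so the wrapper has to track the logical length itself, and every append goes through Python-level bookkeeping.

```python
class BytesBuilder:
    """Hypothetical builder over a preallocated bytearray (illustration only)."""

    def __init__(self, size_hint=0):
        # bytearray(n) creates n zero bytes; we treat them as spare capacity.
        self._buf = bytearray(size_hint)
        self._len = 0  # number of bytes actually written

    def append(self, data):
        end = self._len + len(data)
        if end > len(self._buf):
            # Grow by at least the current size (assumed overallocation policy).
            grow = max(end - len(self._buf), len(self._buf))
            self._buf.extend(bytes(grow))
        # Same-length slice assignment overwrites in place without resizing.
        self._buf[self._len:end] = data
        self._len = end

    def build(self):
        # Copy out only the bytes that were written, dropping spare capacity.
        return bytes(self._buf[:self._len])
```

The manual length tracking and the final copy are the kind of overhead a built-in builder with a size hint would avoid.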