[issue19801] Concatenating bytes is much slower than concatenating strings

Sworddragon report at bugs.python.org
Tue Dec 3 03:46:32 CET 2013


Sworddragon added the comment:

I have extended the benchmark a little and here are my new results:

concatenate_string()           : 0.037489
concatenate_bytes()            : 2.920202
concatenate_bytearray()        : 0.157311
concatenate_string_io()        : 0.035397
concatenate_bytes_io()         : 0.032835
concatenate_string_join()      : 0.170623
concatenate_string_and_encode(): 0.037280

- As we already know, concatenating bytes is much slower than concatenating strings.
- concatenate_bytearray() shows that doing this with bytearrays is roughly 4 times slower than concatenating strings. Also, it returns a bytearray, and I couldn't figure out in this short time how to simply convert it to a bytes object.
- Interestingly concatenate_string_io() shows that using a StringIO object is faster than concatenating strings directly.
- Even more interesting is that concatenate_bytes_io() shows that a BytesIO object is the fastest solution of all.
- Using .join in concatenate_string_join() is slow too: about 4.5 times slower than concatenating strings directly.
- Curiously, I couldn't test concatenate_bytes_join() because it raises an exception. I searched the documentation but couldn't find a join method for bytes objects to see what is wrong.
- I have also tested in concatenate_string_and_encode() how fast it is to concatenate strings and then simply encode the result. The overhead compared to concatenating strings directly is small enough that the test could no longer measure it (0.037280 versus 0.037489).
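
For reference, here is a minimal sketch of three ways to build a bytes object that relate to the points above. The attached test.py is not reproduced in this message, so the function bodies, the CHUNK payload and the COUNT value are assumptions chosen for illustration; only the function names follow the benchmark labels. It relies on two facts about Python 3: bytes(some_bytearray) returns a bytes copy, and bytes objects do have a join method whose separator must itself be bytes. Why the original concatenate_bytes_join() raised an exception cannot be determined from this message.

```python
import io

CHUNK = b"a"     # hypothetical payload, not taken from test.py
COUNT = 100000   # hypothetical iteration count, not taken from test.py


def concatenate_bytearray(chunk=CHUNK, count=COUNT):
    buf = bytearray()
    for _ in range(count):
        buf += chunk           # bytearray is mutable, so this extends in place
    return bytes(buf)          # copy the bytearray into an immutable bytes object


def concatenate_bytes_join(chunk=CHUNK, count=COUNT):
    parts = []
    for _ in range(count):
        parts.append(chunk)
    return b"".join(parts)     # the separator must be bytes, not str


def concatenate_bytes_io(chunk=CHUNK, count=COUNT):
    stream = io.BytesIO()
    for _ in range(count):
        stream.write(chunk)    # append to the in-memory buffer
    return stream.getvalue()   # return the accumulated contents as bytes
```

All three functions return the same bytes value for the same arguments, so they can be swapped in a benchmark without changing the result being built.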


Summary: BytesIO is the fastest solution but requires importing an extra module (io). Concatenating strings and then encoding them seems to be the most practical solution if io is not already imported.
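
A rough sketch of the "concatenate strings, then encode" approach together with a small timing helper may make the summary easier to reproduce. The piece, count and encoding defaults and the time_it helper are assumptions for illustration and are not taken from test.py; absolute timings will differ between machines and Python versions.

```python
import timeit


def concatenate_string_and_encode(piece="a", count=100000, encoding="utf-8"):
    text = ""
    for _ in range(count):
        text += piece              # plain str concatenation in a loop
    return text.encode(encoding)   # a single encode call at the end


def time_it(func, repeat=3):
    # Hypothetical helper: best wall-clock time of several single runs.
    return min(timeit.repeat(func, number=1, repeat=repeat))


if __name__ == "__main__":
    elapsed = time_it(concatenate_string_and_encode)
    print("concatenate_string_and_encode(): %f" % elapsed)
```

Taking the minimum of several runs is one common way to reduce noise from other processes; it is a choice made for this sketch, not necessarily what the attached benchmark does.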

But I'm wondering why Python can't simply have the string optimization on bytes too.

----------
Added file: http://bugs.python.org/file32945/test.py

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue19801>
_______________________________________

