[Python-Dev] Optimize Unicode strings in Python 3.3

Victor Stinner victor.stinner at gmail.com
Wed May 30 00:44:05 CEST 2012


Hi,

>  * Use a Py_UCS4 buffer and then convert to the canonical form (ASCII,
> UCS1 or UCS2). Approach taken by io.StringIO. io.StringIO is not only
> used to write, but also to read and so a Py_UCS4 buffer is a good
> compromise.
>  * PyAccu API: optimized version of chunks=[]; for ...: ...
> chunks.append(text); return ''.join(chunks).
>  * Two steps: compute the length and maximum character of the output
> string, allocate the output string and then write characters. str%args
> was using it.
>  * Optimistic approach. Start with an ASCII buffer, enlarge and widen
> (to UCS2 and then UCS4) the buffer when new characters are written.
> Approach used by the UTF-8 decoder and by str%args since today.

I ran extensive benchmarks on these 4 methods for str%args and str.format(args).

The "two steps" method is not promising: parsing the format string
twice is slower than other methods.
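
To make the cost concrete, here is a rough Python model of the "two steps" idea. It is only a sketch, not the C code: the helper names parse() and format_two_steps() are made up for illustration, and the toy format supports only %s and %%. The point is that the format string has to be parsed once to size the output and a second time to fill it.

```python
def parse(fmt, args):
    """Yield the output pieces of a toy format that only knows %s and %%."""
    it = iter(args)
    i = 0
    while i < len(fmt):
        j = fmt.find("%", i)
        if j < 0:
            yield fmt[i:]          # trailing literal text
            return
        if j > i:
            yield fmt[i:j]         # literal text before the next '%'
        spec = fmt[j + 1:j + 2]
        if spec == "s":
            yield str(next(it))
        elif spec == "%":
            yield "%"
        else:
            raise ValueError("unsupported format: %r" % fmt[j:j + 2])
        i = j + 2


def format_two_steps(fmt, args):
    """Return (result, max_char) using two passes over the format string."""
    # Step 1: parse once, only to compute the length and the maximum character.
    length = 0
    max_char = 0
    for piece in parse(fmt, args):
        length += len(piece)
        max_char = max(max_char, max(map(ord, piece), default=0))

    # Step 2: allocate the exact size, then parse again to write characters.
    out = [None] * length
    pos = 0
    for piece in parse(fmt, args):
        out[pos:pos + len(piece)] = piece
        pos += len(piece)
    return "".join(out), max_char
```

In the real implementation, max_char would pick the storage width (ASCII, UCS1, UCS2 or UCS4) of the exactly sized output string, so no resize is ever needed; the price is the second parse.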

The PyAccu API is faster than a Py_UCS4 buffer to concatenate a lot of
strings, but it is slower in many other cases.

I implemented the last method as the new internal "_PyUnicodeWriter"
API: resize / widen the string buffer when writing new characters. I
implemented more optimizations:
 * overallocate the buffer to limit the cost of realloc()
 * write characters directly in the buffer, avoid temporary buffers
when possible (it is possible in most cases)
 * disable overallocation when formatting the last argument
 * if the result is just a single string, return a reference to that
string instead of copying its characters (optimization already
implemented indirectly in the PyAccu API)

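As an illustration only, the resize / widen behaviour can be modelled in pure Python. This is a sketch under assumptions, not the _PyUnicodeWriter C code: the class name WriterSketch, the list used as a buffer and the 25% overallocation factor are all hypothetical, and the model does not distinguish ASCII from UCS1. The copy-by-reference optimization is left out.

```python
def kind_for(max_char):
    """Bytes per character needed to store max_char (1, 2 or 4)."""
    if max_char <= 0xFF:
        return 1
    if max_char <= 0xFFFF:
        return 2
    return 4


class WriterSketch:
    """Toy model of an optimistic string buffer that grows and widens."""

    def __init__(self, overallocate=True):
        self.kind = 1              # start with the narrowest storage
        self.size = 0              # allocated character slots
        self.pos = 0               # slots already written
        self.buf = []              # stands in for the raw buffer
        self.overallocate = overallocate
        self.reallocs = 0
        self.widenings = 0

    def _prepare(self, length, max_char):
        needed = self.pos + length
        if needed > self.size:
            new_size = needed
            if self.overallocate:
                # Assumed growth factor: 25% extra room to limit reallocs.
                new_size += new_size // 4
            self.buf.extend([0] * (new_size - self.size))
            self.size = new_size
            self.reallocs += 1
        kind = kind_for(max_char)
        if kind > self.kind:
            # In C this step would convert the existing characters
            # to the wider storage; here only the kind is tracked.
            self.kind = kind
            self.widenings += 1

    def write(self, text):
        if not text:
            return
        self._prepare(len(text), max(map(ord, text)))
        # Write characters directly into the buffer, no temporary string.
        for ch in text:
            self.buf[self.pos] = ord(ch)
            self.pos += 1

    def finish(self):
        # Drop the unused overallocated tail and build the final string.
        return "".join(map(chr, self.buf[:self.pos]))
```

A caller would set writer.overallocate = False just before writing the last argument, so that the final resize allocates exactly what is needed.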
The _PyUnicodeWriter is the fastest method: it gives a speed up of 30%
over the Py_UCS4 / PyAccu in general, and from 60% to 100% in some
specific cases!

I also compared str%args and str.format() with Python 2.7 (byte
strings), 3.2 (UTF-16 or UCS-4) and 3.3 (PEP 393): Python 3.3 is as
fast as Python 2.7 and sometimes faster! (Whereas Python 3.2 is 10 to
30% slower than Python 2 in general.)
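
For a quick sanity check on a single interpreter, a minimal sketch with the standard timeit module might look like this. The cases below are hypothetical and chosen for illustration; they are not taken from bench_str.py, and the numbers it prints depend entirely on the machine and build.

```python
import timeit

# Hypothetical micro-benchmarks: name -> (statement, setup).
CASES = {
    "percent, ASCII": ("'x=%s, y=%s' % (a, b)", "a = 'abc'; b = 'def'"),
    "format, ASCII": ("'x={}, y={}'.format(a, b)", "a = 'abc'; b = 'def'"),
    "percent, UCS2": ("'x=%s, y=%s' % (a, b)", "a = '\\u20ac' * 10; b = 'def'"),
    "format, UCS2": ("'x={}, y={}'.format(a, b)", "a = '\\u20ac' * 10; b = 'def'"),
    "percent, int": ("'%d items' % n", "n = 12345"),
    "format, int": ("'{} items'.format(n)", "n = 12345"),
}


def run(number=100000, repeat=5):
    """Return the best time per call, in seconds, for each case."""
    results = {}
    for name, (stmt, setup) in CASES.items():
        timings = timeit.repeat(stmt, setup=setup, number=number, repeat=repeat)
        # Keep the minimum: it is the run least disturbed by other processes.
        results[name] = min(timings) / number
    return results


if __name__ == "__main__":
    for name, seconds in run().items():
        print("%-16s %8.1f ns" % (name, seconds * 1e9))
```

Running the same script under each Python version and comparing the output by hand gives a rough idea of the differences, although it is far less thorough than the benchmark tool described below.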

--

I wrote a tool to run benchmarks and to compare results:
https://bitbucket.org/haypo/misc/src/tip/python/benchmark.py
https://bitbucket.org/haypo/misc/src/tip/python/bench_str.py

Run the benchmark:
./python benchmark.py --file=FILE script bench_str.py

Compare results:
./python benchmark.py compare_to FILE1 FILE2 FILE3 ...

--

Python 2.7 vs 3.2 vs 3.3:

http://bugs.python.org/file25685/REPORT_32BIT_2.7_3.2_writer
http://bugs.python.org/file25687/REPORT_64BIT_2.7_3.2_writer
http://bugs.python.org/file25757/report_windows7

Warning: For the Windows benchmark, Python 3.3 is compiled in 32 bits,
whereas 2.7 and 3.2 are compiled in 64 bits (formatting integers is
slower in 32 bits).

--

UCS4 vs PyAccu vs _PyUnicodeWriter:

http://bugs.python.org/file25686/REPORT_32BIT_3.3
http://bugs.python.org/file25688/REPORT_64BIT_3.3

Victor

