[Python-Dev] Optimize Unicode strings in Python 3.3
Victor Stinner
victor.stinner at gmail.com
Wed May 30 00:44:05 CEST 2012
Hi,
> * Use a Py_UCS4 buffer and then convert to the canonical form (ASCII,
> UCS1 or UCS2). Approach taken by io.StringIO. io.StringIO is not only
> used to write, but also to read and so a Py_UCS4 buffer is a good
> compromise.
> * PyAccu API: optimized version of chunks=[]; for ...: ...
> chunks.append(text); return ''.join(chunks).
> * Two steps: compute the length and maximum character of the output
> string, allocate the output string and then write characters. str%args
> was using it.
> * Optimistic approach. Start with an ASCII buffer, enlarge and widen
> (to UCS2 and then UCS4) the buffer when new characters are written.
> Approach used by the UTF-8 decoder and by str%args since today.
I ran extensive benchmarks on these 4 methods for str%args and str.format(args).
The "two steps" method is not promising: parsing the format string
twice is slower than other methods.
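To make the cost of the "two steps" method concrete, here is a rough Python model of it (an illustration only, not the C code; the names canonical_kind and two_step_concat are made up for this sketch). Every input is scanned once to find the length and the maximum character, then scanned again to write the output:

```python
def canonical_kind(max_char):
    """Return the narrowest PEP 393 storage able to hold max_char."""
    if max_char < 0x80:
        return "ASCII"
    if max_char < 0x100:
        return "UCS1"
    if max_char < 0x10000:
        return "UCS2"
    return "UCS4"


def two_step_concat(pieces):
    pieces = list(pieces)
    # Step 1: scan everything to learn the final length and widest character.
    length = sum(len(p) for p in pieces)
    max_char = max((ord(c) for p in pieces for c in p), default=0)
    # Step 2: allocate once (modelled as a list of code points), then write.
    out = [0] * length
    pos = 0
    for p in pieces:
        for c in p:
            out[pos] = ord(c)
            pos += 1
    return canonical_kind(max_char), "".join(map(chr, out))
```

The buffer is never resized or widened, but the input is walked twice, which matches the double parsing of the format string described above.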
The PyAccu API is faster than a Py_UCS4 buffer to concatenate a lot of
strings, but it is slower in many other cases.
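For reference, the Python-level idiom that PyAccu is described as optimizing looks roughly like this (join_chunks is a hypothetical name used only for this example):

```python
def join_chunks(items):
    """Accumulate text pieces in a list, then join them once at the end."""
    chunks = []
    for item in items:
        chunks.append(str(item))
    return "".join(chunks)
```

This does well when there are many pieces, since the result is built by a single join, but each piece still has to exist as its own string first.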
I implemented the last method as the new internal "_PyUnicodeWriter"
API: resize / widen the string buffer when writing new characters. I
implemented more optimizations:
* overallocate the buffer to limit the cost of realloc()
* write characters directly in the buffer, avoid temporary buffers
when possible (it is possible in most cases)
* disable overallocation when formatting the last argument
* don't copy by value but copy by reference if the result is just a
string (optimization already implemented indirectly in the PyAccu API)
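The resize / widen idea can be modelled in a few lines of Python. This is a toy sketch, not the _PyUnicodeWriter C API: the class name, the method names and the 25% growth factor are all assumptions chosen for illustration, and the copy-by-reference optimization is not modelled.

```python
class WriterSketch:
    """Toy model: the buffer starts narrow, then widens and grows on demand."""

    def __init__(self, overallocate=True):
        self.kind = 1            # bytes per character: 1, 2 (UCS2) or 4 (UCS4)
        self.buf = bytearray()   # capacity * kind bytes of storage
        self.pos = 0             # number of characters written so far
        self.overallocate = overallocate

    def _read(self, i):
        k = self.kind
        return int.from_bytes(self.buf[i * k:(i + 1) * k], "little")

    def _prepare(self, extra, max_char):
        kind = 1 if max_char < 0x100 else 2 if max_char < 0x10000 else 4
        if kind > self.kind:
            # Widen: copy the characters written so far into wider cells.
            chars = [self._read(i) for i in range(self.pos)]
            capacity = len(self.buf) // self.kind
            self.kind = kind
            self.buf = bytearray(capacity * kind)
            for i, cp in enumerate(chars):
                self.buf[i * kind:(i + 1) * kind] = cp.to_bytes(kind, "little")
        needed = self.pos + extra
        capacity = len(self.buf) // self.kind
        if needed > capacity:
            if self.overallocate:
                needed += needed // 4   # growth factor is an assumption
            self.buf.extend(bytes((needed - capacity) * self.kind))

    def write(self, text):
        if not text:
            return
        self._prepare(len(text), max(map(ord, text)))
        k = self.kind
        for ch in text:
            # Write directly into the buffer, without a temporary string.
            self.buf[self.pos * k:(self.pos + 1) * k] = ord(ch).to_bytes(k, "little")
            self.pos += 1

    def finish(self):
        return "".join(chr(self._read(i)) for i in range(self.pos))
```

Setting overallocate to False before the final write mimics disabling overallocation for the last argument: the buffer then grows to exactly the size needed.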
The _PyUnicodeWriter API is the fastest method: it gives a speedup of 30%
over the Py_UCS4 buffer and PyAccu methods in general, and from 60% to
100% in some specific cases!
I also compared str%args and str.format() with Python 2.7 (byte
strings), 3.2 (UTF-16 or UCS-4) and 3.3 (PEP 393): Python 3.3 is as
fast as Python 2.7 and sometimes faster! (Whereas Python 3.2 is 10 to
30% slower than Python 2 in general)
--
I wrote a tool to run benchmarks and to compare results:
https://bitbucket.org/haypo/misc/src/tip/python/benchmark.py
https://bitbucket.org/haypo/misc/src/tip/python/bench_str.py
Run the benchmark:
./python benchmark.py --file=FILE script bench_str.py
Compare results:
./python benchmark.py compare_to FILE1 FILE2 FILE3 ...
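For a quick, rough check without the tool, something like the following can be used. It is only a minimal sketch based on the standard timeit module, not benchmark.py or bench_str.py; the statements, string sizes and iteration counts are arbitrary choices for illustration.

```python
import timeit


def bench(stmt, setup="pass", repeat=5, number=20_000):
    """Return the best per-call time in seconds over `repeat` runs."""
    timer = timeit.Timer(stmt, setup=setup)
    return min(timer.repeat(repeat=repeat, number=number)) / number


setup = "a = 'x' * 10; b = 'y' * 10"
cases = {
    "str % args": "'%s and %s' % (a, b)",
    "str.format": "'{} and {}'.format(a, b)",
}
results = {name: bench(stmt, setup) for name, stmt in cases.items()}
for name, seconds in results.items():
    print(f"{name}: {seconds * 1e9:.1f} ns per call")
```

Taking the minimum over several runs reduces noise from other processes; run the same script under each Python build to compare them.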
--
Python 2.7 vs 3.2 vs 3.3:
http://bugs.python.org/file25685/REPORT_32BIT_2.7_3.2_writer
http://bugs.python.org/file25687/REPORT_64BIT_2.7_3.2_writer
http://bugs.python.org/file25757/report_windows7
Warning: For the Windows benchmark, Python 3.3 is compiled in 32 bits,
whereas 2.7 and 3.2 are compiled in 64 bits (formatting integers is
slower in 32 bits).
--
UCS4 vs PyAccu vs _PyUnicodeWriter:
http://bugs.python.org/file25686/REPORT_32BIT_3.3
http://bugs.python.org/file25688/REPORT_64BIT_3.3
Victor