[New-bugs-announce] [issue15027] Faster UTF-32 encoding
Serhiy Storchaka
report at bugs.python.org
Thu Jun 7 15:57:31 CEST 2012
New submission from Serhiy Storchaka <storchaka at gmail.com>:
As a companion to issue14625, here is a patch that speeds up UTF-32 encoding several times over. In addition, it fixes an unsafe integer overflow check.
Here are the benchmark results. The benchmark tools are in the https://bitbucket.org/storchaka/cpython-stuff repository.
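For readers who want a concrete picture of what is being optimized, here is a minimal pure-Python sketch of UTF-32 encoding together with an overflow-safe size check. It is not the patch itself (the patch is C code inside the interpreter); the function name `encode_utf32` and the `max_size` parameter are hypothetical, chosen only for illustration. The size check divides the limit by 4 rather than multiplying the length by 4, which is the usual way to avoid wraparound in fixed-width C arithmetic.

```python
import struct
import sys


def encode_utf32(text, byteorder="little", max_size=sys.maxsize):
    """Reference UTF-32 encoder: one 4-byte unit per code point, no BOM.

    Hypothetical helper for illustration only. Unlike the real codec,
    it does not reject lone surrogates.
    """
    n = len(text)
    # Overflow-safe check: compare n against max_size // 4 instead of
    # computing n * 4 first. In C, n * 4 could wrap around before the
    # comparison is made; the division cannot.
    if n > max_size // 4:
        raise MemoryError("encoded result would be too large")
    prefix = "<" if byteorder == "little" else ">"
    # "I" is an unsigned 32-bit integer; pack one per code point.
    return struct.pack("%s%dI" % (prefix, n), *map(ord, text))


if __name__ == "__main__":
    sample = "A\x80\u0100\u8000\U00010000"
    print(encode_utf32(sample) == sample.encode("utf-32-le"))
    print(encode_utf32(sample, "big") == sample.encode("utf-32-be"))
```

The built-in codecs `utf-32-le` and `utf-32-be` should produce the same bytes for any string without lone surrogates, so the sketch can be compared against them directly.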
On 32-bit Linux, AMD Athlon 64 X2 4600+ @ 2.4GHz:
Py2.7         Py3.2         Py3.3         patched
541 (+1032%)  541 (+1032%)  844 (+626%)   6125     encode utf-32le 'A'*10000
543 (+1056%)  541 (+1060%)  844 (+643%)   6275     encode utf-32le '\x80'*10000
544 (+1010%)  542 (+1014%)  843 (+616%)   6037     encode utf-32le '\x80'+'A'*9999
541 (+799%)   542 (+797%)   764 (+537%)   4864     encode utf-32le '\u0100'*10000
544 (+781%)   542 (+784%)   767 (+525%)   4793     encode utf-32le '\u0100'+'A'*9999
544 (+789%)   542 (+792%)   766 (+531%)   4834     encode utf-32le '\u0100'+'\x80'*9999
542 (+799%)   541 (+801%)   764 (+538%)   4874     encode utf-32le '\u8000'*10000
544 (+779%)   542 (+782%)   767 (+523%)   4780     encode utf-32le '\u8000'+'A'*9999
544 (+793%)   542 (+796%)   766 (+534%)   4859     encode utf-32le '\u8000'+'\x80'*9999
544 (+819%)   542 (+823%)   766 (+553%)   5001     encode utf-32le '\u8000'+'\u0100'*9999
430 (+867%)   427 (+874%)   860 (+383%)   4157     encode utf-32le '\U00010000'*10000
543 (+655%)   543 (+655%)   861 (+376%)   4101     encode utf-32le '\U00010000'+'A'*9999
543 (+658%)   543 (+658%)   861 (+378%)   4116     encode utf-32le '\U00010000'+'\x80'*9999
543 (+670%)   543 (+670%)   859 (+387%)   4180     encode utf-32le '\U00010000'+'\u0100'*9999
543 (+666%)   543 (+666%)   860 (+383%)   4158     encode utf-32le '\U00010000'+'\u8000'*9999
541 (+880%)   543 (+876%)   844 (+528%)   5300     encode utf-32be 'A'*10000
541 (+872%)   542 (+870%)   844 (+523%)   5256     encode utf-32be '\x80'*10000
544 (+843%)   542 (+846%)   843 (+509%)   5130     encode utf-32be '\x80'+'A'*9999
541 (+363%)   542 (+362%)   764 (+228%)   2505     encode utf-32be '\u0100'*10000
544 (+366%)   542 (+368%)   766 (+231%)   2534     encode utf-32be '\u0100'+'A'*9999
544 (+363%)   542 (+365%)   766 (+229%)   2519     encode utf-32be '\u0100'+'\x80'*9999
542 (+363%)   541 (+364%)   764 (+228%)   2509     encode utf-32be '\u8000'*10000
544 (+366%)   542 (+368%)   766 (+231%)   2534     encode utf-32be '\u8000'+'A'*9999
544 (+363%)   542 (+364%)   766 (+229%)   2517     encode utf-32be '\u8000'+'\x80'*9999
544 (+372%)   542 (+374%)   766 (+235%)   2568     encode utf-32be '\u8000'+'\u0100'*9999
430 (+428%)   427 (+432%)   860 (+164%)   2270     encode utf-32be '\U00010000'*10000
543 (+317%)   541 (+318%)   861 (+163%)   2262     encode utf-32be '\U00010000'+'A'*9999
543 (+320%)   541 (+321%)   861 (+165%)   2279     encode utf-32be '\U00010000'+'\x80'*9999
543 (+322%)   541 (+323%)   859 (+167%)   2290     encode utf-32be '\U00010000'+'\u0100'*9999
543 (+322%)   541 (+324%)   860 (+167%)   2292     encode utf-32be '\U00010000'+'\u8000'*9999
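For anyone wanting to reproduce measurements of this kind without the benchmark tools from the repository, a rough sketch using only the standard `timeit` module could look like the following. It is not the script that produced the table above; the helper names `speedup_percent` and `bench` are hypothetical, and the timings are reported in seconds per call rather than in the units of the table. The percentages in parentheses appear to be the gain of the patched build over each baseline, which `speedup_percent` recomputes. The canonical codec names `utf-32-le` and `utf-32-be` are used here.

```python
import timeit

# A few of the inputs from the table; extend as needed.
CASES = [
    ("utf-32-le", "A" * 10000),
    ("utf-32-le", "\x80" * 10000),
    ("utf-32-le", "\u0100" * 10000),
    ("utf-32-le", "\U00010000" * 10000),
    ("utf-32-be", "A" * 10000),
    ("utf-32-be", "\U00010000" + "\u8000" * 9999),
]


def speedup_percent(baseline, patched):
    """Relative gain of `patched` over `baseline`, rounded to a whole percent."""
    return round((patched / baseline - 1) * 100)


def bench(encoding, text, number=200, repeat=3):
    """Best-of-`repeat` time in seconds for a single text.encode(encoding) call."""
    timings = timeit.repeat(lambda: text.encode(encoding),
                            number=number, repeat=repeat)
    return min(timings) / number


if __name__ == "__main__":
    for encoding, text in CASES:
        seconds = bench(encoding, text)
        print("%-10s %5d chars  %.2f us/call"
              % (encoding, len(text), seconds * 1e6))
```

As a sanity check against the first row of the table, `speedup_percent(541, 6125)` gives 1032 and `speedup_percent(844, 6125)` gives 626.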
----------
components: Interpreter Core, Unicode
files: encode-utf32.patch
keywords: patch
messages: 162474
nosy: Arfrever, asvetlov, ezio.melotti, haypo, pitrou, storchaka
priority: normal
severity: normal
status: open
title: Faster UTF-32 encoding
type: performance
versions: Python 3.3
Added file: http://bugs.python.org/file25857/encode-utf32.patch
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue15027>
_______________________________________