[New-bugs-announce] [issue15026] Faster UTF-16 encoding
Serhiy Storchaka
report at bugs.python.org
Thu Jun 7 15:56:14 CEST 2012
New submission from Serhiy Storchaka <storchaka at gmail.com>:
As a companion to issue14624, here is a patch that speeds up UTF-16 encoding severalfold. In addition, it fixes an unsafe integer overflow check.
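To illustrate what an overflow-safe size check can look like, here is a minimal Python sketch. It is not taken from the patch: the function name utf16_encoded_size and the constant SSIZE_MAX are hypothetical, and the sketch assumes the usual pattern of testing each limit before doing the arithmetic rather than after it. Python integers do not overflow, so the checks only mirror what C code has to do.

```python
import sys

# Hypothetical stand-in for C's PY_SSIZE_T_MAX (sys.maxsize has that value in CPython).
SSIZE_MAX = sys.maxsize


def utf16_encoded_size(text, with_bom=True):
    """Return the number of bytes needed to encode text as UTF-16."""
    # Characters above U+FFFF need a surrogate pair: one extra 16-bit unit each.
    pairs = sum(1 for ch in text if ord(ch) > 0xFFFF)
    bom = 1 if with_bom else 0
    # Test before adding, so the sum itself can never exceed the limit.
    if len(text) > SSIZE_MAX - pairs - bom:
        raise MemoryError("UTF-16 output too large")
    units = len(text) + pairs + bom
    # Test before doubling, instead of doubling first and checking afterwards.
    if units > SSIZE_MAX // 2:
        raise MemoryError("UTF-16 output too large")
    return units * 2
```

For example, utf16_encoded_size('A'*10000, with_bom=False) matches len(('A'*10000).encode('utf-16le')).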
Here are the benchmark results. The figure in parentheses is the patched build's gain over that version. The benchmark tools are in the https://bitbucket.org/storchaka/cpython-stuff repository.
On 32-bit Linux, AMD Athlon 64 X2 4600+ @ 2.4GHz:
Py2.7         Py3.2         Py3.3         patched
457 (+575%)   458 (+573%)   1077 (+186%)  3083     encode utf-16le 'A'*10000
457 (+579%)   493 (+529%)   1084 (+186%)  3102     encode utf-16le '\x80'*10000
489 (+534%)   458 (+577%)   1081 (+187%)  3102     encode utf-16le '\x80'+'A'*9999
457 (+1261%)  493 (+1161%)  1116 (+457%)  6219     encode utf-16le '\u0100'*10000
489 (+1266%)  458 (+1358%)  1126 (+493%)  6678     encode utf-16le '\u0100'+'A'*9999
489 (+1263%)  458 (+1355%)  1129 (+490%)  6666     encode utf-16le '\u0100'+'\x80'*9999
457 (+1240%)  493 (+1142%)  1118 (+448%)  6125     encode utf-16le '\u8000'*10000
489 (+1271%)  458 (+1363%)  1127 (+495%)  6702     encode utf-16le '\u8000'+'A'*9999
489 (+1271%)  458 (+1364%)  1129 (+494%)  6705     encode utf-16le '\u8000'+'\x80'*9999
489 (+1135%)  458 (+1218%)  1136 (+432%)  6038     encode utf-16le '\u8000'+'\u0100'*9999
498 (+128%)   505 (+125%)   630 (+80%)    1137     encode utf-16le '\U00010000'*10000
489 (+35%)    458 (+44%)    360 (+83%)    659      encode utf-16le '\U00010000'+'A'*9999
489 (+35%)    458 (+44%)    359 (+84%)    660      encode utf-16le '\U00010000'+'\x80'*9999
489 (+36%)    458 (+45%)    361 (+84%)    663      encode utf-16le '\U00010000'+'\u0100'*9999
489 (+36%)    458 (+45%)    361 (+84%)    663      encode utf-16le '\U00010000'+'\u8000'*9999
447 (+507%)   493 (+450%)   1086 (+150%)  2712     encode utf-16be 'A'*10000
447 (+513%)   493 (+456%)   1080 (+154%)  2739     encode utf-16be '\x80'*10000
489 (+458%)   458 (+496%)   1079 (+153%)  2729     encode utf-16be '\x80'+'A'*9999
447 (+498%)   494 (+441%)   1118 (+139%)  2672     encode utf-16be '\u0100'*10000
489 (+464%)   458 (+502%)   1128 (+144%)  2756     encode utf-16be '\u0100'+'A'*9999
489 (+463%)   458 (+502%)   1131 (+144%)  2755     encode utf-16be '\u0100'+'\x80'*9999
447 (+500%)   493 (+444%)   1119 (+139%)  2680     encode utf-16be '\u8000'*10000
489 (+463%)   458 (+502%)   1126 (+145%)  2755     encode utf-16be '\u8000'+'A'*9999
489 (+464%)   458 (+502%)   1129 (+144%)  2757     encode utf-16be '\u8000'+'\x80'*9999
489 (+479%)   458 (+518%)   1137 (+149%)  2829     encode utf-16be '\u8000'+'\u0100'*9999
499 (+102%)   506 (+99%)    630 (+60%)    1009     encode utf-16be '\U00010000'*10000
489 (+6%)     458 (+13%)    360 (+44%)    519      encode utf-16be '\U00010000'+'A'*9999
489 (+6%)     458 (+13%)    359 (+44%)    518      encode utf-16be '\U00010000'+'\x80'*9999
489 (+6%)     458 (+13%)    361 (+44%)    519      encode utf-16be '\U00010000'+'\u0100'*9999
489 (+6%)     458 (+13%)    361 (+44%)    519      encode utf-16be '\U00010000'+'\u8000'*9999
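
For a rough check of rows like these without the benchmark tools, a minimal sketch using the standard timeit module could look like the following. It is not the script that produced the table: the names encode_rate, CASES and run are hypothetical, only a few of the inputs are included, and the reported unit (characters per second) is an assumption of this sketch.

```python
import timeit


def encode_rate(text, encoding, repeat=3, number=100):
    """Return the best observed rate, in characters per second, of text.encode(encoding)."""
    timings = timeit.repeat(lambda: text.encode(encoding), repeat=repeat, number=number)
    # The fastest run is the one least disturbed by other system activity.
    return len(text) * number / min(timings)


# A few of the inputs from the table, written as Python expressions.
CASES = [
    "'A'*10000",
    "'\\x80'*10000",
    "'\\u0100'*10000",
    "'\\U00010000'*10000",
    "'\\U00010000'+'A'*9999",
]


def run():
    """Print one line per (encoding, input) pair."""
    for encoding in ("utf-16le", "utf-16be"):
        for expr in CASES:
            text = eval(expr)  # the expressions above are fixed literals, so eval is safe here
            print("%12.0f  encode %s %s" % (encode_rate(text, encoding), encoding, expr))
```

Calling run() under each interpreter to be compared gives one column at a time. Absolute numbers depend on the machine and build, so only ratios between interpreters on the same machine are meaningful.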
----------
components: Interpreter Core, Unicode
files: encode-utf16.patch
keywords: patch
messages: 162473
nosy: Arfrever, asvetlov, ezio.melotti, haypo, pitrou, storchaka
priority: normal
severity: normal
status: open
title: Faster UTF-16 encoding
type: performance
versions: Python 3.3
Added file: http://bugs.python.org/file25856/encode-utf16.patch
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue15026>
_______________________________________