[issue14923] Even faster UTF-8 decoding

Antoine Pitrou report at bugs.python.org
Sat May 26 11:19:47 CEST 2012


Antoine Pitrou <pitrou at free.fr> added the comment:

I see a slight increase in decoding speed under 64-bit Linux with gcc 4.5.2, too:

                                          vanilla       patched

utf-8     'A'*10000                       7857 (+4%)    8210
utf-8         'A'*9999+'\x80'             5392 (+8%)    5843
utf-8         'A'*9999+'\u0100'           2119 (+3%)    2173
utf-8         'A'*9999+'\u8000'           2121 (+2%)    2172
utf-8         'A'*9999+'\U00010000'       2248 (+2%)    2293
utf-8     '\x80'*10000                    1015 (+1%)    1021
utf-8       '\x80'+'A'*9999               2747 (+5%)    2877
utf-8         '\x80'*9999+'\u0100'        868 (+0%)     869
utf-8         '\x80'*9999+'\u8000'        857 (+2%)     870
utf-8         '\x80'*9999+'\U00010000'    877 (+0%)     881
utf-8     '\u0100'*10000                  1016 (+16%)   1181
utf-8       '\u0100'+'A'*9999             2506 (+3%)    2592
utf-8       '\u0100'+'\x80'*9999          1015 (+16%)   1179
utf-8         '\u0100'*9999+'\u8000'      1015 (+16%)   1182
utf-8         '\u0100'*9999+'\U00010000'  875 (+13%)    992
utf-8     '\u8000'*10000                  836 (+18%)    985
utf-8       '\u8000'+'A'*9999             2508 (+3%)    2588
utf-8       '\u8000'+'\x80'*9999          1015 (+16%)   1182
utf-8       '\u8000'+'\u0100'*9999        1014 (+17%)   1182
utf-8         '\u8000'*9999+'\U00010000'  767 (+17%)    894
utf-8     '\U00010000'*10000              730 (+0%)     732
utf-8       '\U00010000'+'A'*9999         2542 (+2%)    2599
utf-8       '\U00010000'+'\x80'*9999      1013 (+17%)   1182
utf-8       '\U00010000'+'\u0100'*9999    1013 (+17%)   1181
utf-8       '\U00010000'+'\u8000'*9999    727 (+0%)     728
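
For anyone wanting to try similar measurements, here is a rough sketch and not the script that produced the table above. It times bytes.decode('utf-8') with the stdlib timeit module on a few of the same inputs, and recomputes the bracketed percentage as the rounded gain of patched over vanilla, which appears to match the rows above. The names decode_rate, percent_gain and main are made up for this sketch, and the reported unit (decodes per second) is an assumption, since the message does not state the unit of its figures.

```python
import timeit


def decode_rate(text, repeat=5, number=200):
    """Return UTF-8 decodes per second for `text` (best of `repeat` runs).

    Hypothetical helper: the unit is an assumption of this sketch,
    not necessarily the unit used in the table above.
    """
    data = text.encode("utf-8")
    # timeit.repeat returns one total time per run; keep the fastest run.
    best = min(timeit.repeat(lambda: data.decode("utf-8"),
                             repeat=repeat, number=number))
    return number / best


def percent_gain(vanilla, patched):
    """Relative gain of `patched` over `vanilla`, rounded to a whole percent."""
    return round((patched / vanilla - 1) * 100)


def main():
    # A handful of the input shapes that appear in the table.
    cases = [
        ("'A'*10000", "A" * 10000),
        ("'A'*9999+'\\x80'", "A" * 9999 + "\x80"),
        ("'\\u0100'*10000", "\u0100" * 10000),
        ("'\\u8000'*10000", "\u8000" * 10000),
        ("'\\U00010000'*10000", "\U00010000" * 10000),
    ]
    for label, text in cases:
        print("utf-8  %-24s %10.0f" % (label, decode_rate(text)))


if __name__ == "__main__":
    main()
```

Running it once under each interpreter build and feeding the two figures for a row to percent_gain gives a number comparable to the bracketed column, e.g. percent_gain(7857, 8210) for the first row.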

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue14923>
_______________________________________

