[New-bugs-announce] [issue16334] Faster unicode-escape and raw-unicode-escape codecs
Serhiy Storchaka
report at bugs.python.org
Sat Oct 27 00:48:25 CEST 2012
New submission from Serhiy Storchaka:
The proposed patch optimizes the unicode-escape and raw-unicode-escape codecs. The codecs are still slower than in 3.2, but much faster than in 3.3. Further speedup is possible by using stringlib, but I think this is enough. The code is unified and simplified (251 insertions, 345 deletions).
Benchmark results (on AMD Athlon 64 X2 4600+):
Py2.7         Py3.2         Py3.3         Py3.4+patch
193 (+11%)    325 (-34%)    66 (+224%)    214          decode unicode-escape '\x41'*0 + 'A'*10000
138 (+72%)    241 (-1%)     154 (+55%)    238          decode unicode-escape '\x80'*10000
193 (+10%)    323 (-34%)    72 (+194%)    212          decode unicode-escape '\x80'+'A'*9999
160 (+59%)    273 (-7%)     169 (+51%)    255          decode unicode-escape '\u0100'*10000
193 (-7%)     324 (-44%)    61 (+195%)    180          decode unicode-escape '\u0100'+'A'*9999
138 (+67%)    242 (-5%)     135 (+71%)    231          decode unicode-escape '\u0100'+'\x80'*9999
160 (+59%)    274 (-7%)     169 (+51%)    255          decode unicode-escape '\u8000'*10000
193 (-7%)     323 (-44%)    61 (+195%)    180          decode unicode-escape '\u8000'+'A'*9999
138 (+67%)    242 (-5%)     135 (+71%)    231          decode unicode-escape '\u8000'+'\x80'*9999
160 (+60%)    276 (-7%)     169 (+51%)    256          decode unicode-escape '\u8000'+'\u0100'*9999
178 (+42%)    275 (-8%)     177 (+43%)    253          decode unicode-escape '\U00010000'*10000
192 (+30%)    323 (-23%)    61 (+310%)    250          decode unicode-escape '\U00010000'+'A'*9999
139 (+35%)    243 (-23%)    119 (+57%)    187          decode unicode-escape '\U00010000'+'\x80'*9999
161 (+38%)    273 (-19%)    150 (+48%)    222          decode unicode-escape '\U00010000'+'\u0100'*9999
161 (+38%)    273 (-19%)    150 (+48%)    222          decode unicode-escape '\U00010000'+'\u8000'*9999
558 (-62%)    427 (-50%)    82 (+161%)    214          decode raw-unicode-escape 'A'*10000
560 (-62%)    425 (-50%)    75 (+183%)    212          decode raw-unicode-escape '\x80'*10000
558 (-62%)    425 (-50%)    75 (+183%)    212          decode raw-unicode-escape '\x80'+'A'*9999
178 (+75%)    235 (+32%)    108 (+188%)   311          decode raw-unicode-escape '\u0100'*10000
555 (-62%)    424 (-50%)    61 (+248%)    212          decode raw-unicode-escape '\u0100'+'A'*9999
559 (-62%)    424 (-50%)    61 (+248%)    212          decode raw-unicode-escape '\u0100'+'\x80'*9999
179 (+74%)    235 (+32%)    108 (+188%)   311          decode raw-unicode-escape '\u8000'*10000
555 (-62%)    424 (-50%)    61 (+248%)    212          decode raw-unicode-escape '\u8000'+'A'*9999
558 (-62%)    425 (-50%)    61 (+248%)    212          decode raw-unicode-escape '\u8000'+'\x80'*9999
178 (+75%)    235 (+32%)    108 (+188%)   311          decode raw-unicode-escape '\u8000'+'\u0100'*9999
200 (+18%)    249 (-5%)     132 (+79%)    236          decode raw-unicode-escape '\U00010000'*10000
554 (-58%)    423 (-46%)    61 (+277%)    230          decode raw-unicode-escape '\U00010000'+'A'*9999
558 (-59%)    424 (-46%)    61 (+277%)    230          decode raw-unicode-escape '\U00010000'+'\x80'*9999
178 (+46%)    235 (+11%)    100 (+160%)   260          decode raw-unicode-escape '\U00010000'+'\u0100'*9999
178 (+44%)    235 (+9%)     100 (+157%)   257          decode raw-unicode-escape '\U00010000'+'\u8000'*9999
182 (+137%)   215 (+101%)   148 (+192%)   432          encode unicode-escape 'A'*10000
582 (-10%)    617 (-16%)    470 (+11%)    521          encode unicode-escape '\x80'*10000
182 (+131%)   215 (+96%)    148 (+184%)   421          encode unicode-escape '\x80'+'A'*9999
624 (-7%)     967 (-40%)    558 (+4%)     579          encode unicode-escape '\u0100'*10000
183 (-19%)    215 (-31%)    132 (+12%)    148          encode unicode-escape '\u0100'+'A'*9999
584 (-23%)    617 (-27%)    464 (-3%)     451          encode unicode-escape '\u0100'+'\x80'*9999
627 (-8%)     968 (-40%)    557 (+4%)     579          encode unicode-escape '\u8000'*10000
183 (-19%)    215 (-31%)    148 (+0%)     148          encode unicode-escape '\u8000'+'A'*9999
584 (-23%)    617 (-27%)    490 (-8%)     451          encode unicode-escape '\u8000'+'\x80'*9999
629 (-8%)     969 (-40%)    555 (+4%)     578          encode unicode-escape '\u8000'+'\u0100'*9999
931 (-39%)    939 (-39%)    602 (-5%)     572          encode unicode-escape '\U00010000'*10000
183 (+7%)     215 (-9%)     180 (+9%)     196          encode unicode-escape '\U00010000'+'A'*9999
584 (-9%)     617 (-13%)    482 (+11%)    534          encode unicode-escape '\U00010000'+'\x80'*9999
630 (-14%)    962 (-43%)    565 (-4%)     544          encode unicode-escape '\U00010000'+'\u0100'*9999
630 (-14%)    964 (-44%)    565 (-4%)     544          encode unicode-escape '\U00010000'+'\u8000'*9999
332 (+1459%)  330 (+1468%)  333 (+1454%)  5175         encode raw-unicode-escape 'A'*10000
332 (+1589%)  329 (+1604%)  333 (+1584%)  5607         encode raw-unicode-escape '\x80'*10000
336 (+1569%)  334 (+1579%)  333 (+1584%)  5607         encode raw-unicode-escape '\x80'+'A'*9999
904 (-38%)    911 (-39%)    557 (+0%)     558          encode raw-unicode-escape '\u0100'*10000
336 (+15%)    335 (+16%)    197 (+97%)    388          encode raw-unicode-escape '\u0100'+'A'*9999
335 (+16%)    335 (+16%)    197 (+97%)    388          encode raw-unicode-escape '\u0100'+'\x80'*9999
904 (-38%)    913 (-39%)    557 (+0%)     558          encode raw-unicode-escape '\u8000'*10000
335 (+16%)    335 (+16%)    197 (+96%)    387          encode raw-unicode-escape '\u8000'+'A'*9999
335 (+16%)    335 (+16%)    196 (+98%)    388          encode raw-unicode-escape '\u8000'+'\x80'*9999
912 (-39%)    909 (-39%)    554 (+1%)     558          encode raw-unicode-escape '\u8000'+'\u0100'*9999
966 (-40%)    997 (-42%)    584 (-0%)     583          encode raw-unicode-escape '\U00010000'*10000
336 (-42%)    335 (-41%)    213 (-8%)     196          encode raw-unicode-escape '\U00010000'+'A'*9999
336 (-42%)    335 (-41%)    213 (-8%)     196          encode raw-unicode-escape '\U00010000'+'\x80'*9999
911 (-43%)    911 (-43%)    570 (-8%)     522          encode raw-unicode-escape '\U00010000'+'\u0100'*9999
911 (-43%)    913 (-43%)    570 (-8%)     522          encode raw-unicode-escape '\U00010000'+'\u8000'*9999
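
The percentages in the table above are consistent with reading each one as the change of the Py3.4+patch figure relative to that version's figure (for example, going from 193 to 214 is +11%). The post does not include the benchmark script or state the units, so what follows is only a minimal sketch of how such figures could be measured and compared, not the script that produced the table. `relative_change` and `best_time` are hypothetical helper names, and the sketch assumes the inputs for the decode rows are obtained by encoding the listed strings with the same codec.

```python
import timeit


def relative_change(patched, baseline):
    """Percent change of the patched figure relative to a baseline figure."""
    return round((patched / baseline - 1) * 100)


def best_time(codec, text, direction, number=100, repeat=3):
    """Best per-call time in seconds for encoding or decoding `text`.

    Hypothetical helper: for "decode", the input bytes are produced by
    encoding `text` with the same codec (an assumption about the benchmark).
    """
    if direction == "encode":
        def run():
            return text.encode(codec)
    elif direction == "decode":
        data = text.encode(codec)

        def run():
            return data.decode(codec)
    else:
        raise ValueError("direction must be 'encode' or 'decode'")
    # timeit.repeat returns total time for `number` calls, once per repeat.
    return min(timeit.repeat(run, number=number, repeat=repeat)) / number


if __name__ == "__main__":
    samples = {
        "'A'*10000": "A" * 10000,
        "'\\x80'*10000": "\x80" * 10000,
        "'\\u0100'*10000": "\u0100" * 10000,
        "'\\U00010000'*10000": "\U00010000" * 10000,
    }
    for codec in ("unicode-escape", "raw-unicode-escape"):
        for direction in ("decode", "encode"):
            for label, text in samples.items():
                seconds = best_time(codec, text, direction)
                print(f"{seconds * 1e6:10.1f} us  {direction} {codec} {label}")
```

Running the same script under each interpreter and feeding pairs of results to `relative_change` would give percentages in the same style as the table; the absolute values will differ by machine and build.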
----------
components: Interpreter Core, Unicode
files: faster_unicode_escape.patch
keywords: 3.3regression, patch
messages: 173901
nosy: benjamin.peterson, ezio.melotti, haypo, lemburg, pitrou, serhiy.storchaka
priority: normal
severity: normal
stage: patch review
status: open
title: Faster unicode-escape and raw-unicode-escape codecs
type: performance
versions: Python 3.4
Added file: http://bugs.python.org/file27740/faster_unicode_escape.patch
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue16334>
_______________________________________