[issue24214] UTF-8 incremental decoder doesn't support surrogatepass correctly

Serhiy Storchaka report at bugs.python.org
Tue Aug 2 14:53:51 EDT 2016


Serhiy Storchaka added the comment:

The patch slows down decoding by up to 20%.

$ ./python -m timeit -s 'b = b"\xc4\x80"*10000' -- 'b.decode()'
Unpatched:  10000 loops, best of 3: 50.8 usec per loop
Patched:    10000 loops, best of 3: 63.3 usec per loop

And I'm not sure that fixing this only for the surrogatepass handler is enough. The other standard error handlers appear to work, but what if a user-defined handler consumes more than one byte?
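
A minimal sketch of that concern, not taken from the patch. The handler name skip_two and the helper decode_chunked are hypothetical, made up for illustration; only codecs.register_error and codecs.getincrementaldecoder are standard library calls. The handler replaces up to two bytes with a single "?", so its result depends on how many bytes the decoder has been given when the error is raised.

```python
import codecs


def skip_two(exc):
    """Hypothetical user handler: replace up to two bytes with one '?'."""
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    # Clamp so the resume position never points past the available input.
    new_pos = min(exc.start + 2, len(exc.object))
    return ("?", new_pos)


codecs.register_error("skip_two", skip_two)  # illustrative name


def decode_chunked(data, errors, size=1):
    """Feed data to the UTF-8 incremental decoder `size` bytes at a time."""
    decoder = codecs.getincrementaldecoder("utf-8")(errors)
    parts = [decoder.decode(data[i:i + size])
             for i in range(0, len(data), size)]
    # Flush whatever the decoder is still buffering.
    parts.append(decoder.decode(b"", final=True))
    return "".join(parts)


data = b"a\xff\xfeb"
one_shot = data.decode("utf-8", "skip_two")   # handler sees the whole input
chunked = decode_chunked(data, "skip_two")    # handler sees one byte at a time
print(repr(one_shot), repr(chunked))
```

Comparing the two printed values on a given build shows whether such a handler behaves the same in both modes. With one-byte chunks the handler cannot skip a byte that the decoder has not received yet, so the results need not match.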

----------
components: +Interpreter Core
priority: normal -> high

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue24214>
_______________________________________

