[issue24214] UTF-8 incremental decoder doesn't support surrogatepass correctly
Serhiy Storchaka
report at bugs.python.org
Tue Aug 2 14:53:51 EDT 2016
Serhiy Storchaka added the comment:
The patch slows down decoding by up to 20%.
$ ./python -m timeit -s 'b = b"\xc4\x80"*10000' -- 'b.decode()'
Unpatched: 10000 loops, best of 3: 50.8 usec per loop
Patched: 10000 loops, best of 3: 63.3 usec per loop
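
For reference, the two timings above can be compared in two ways. This is only a small arithmetic sketch using the numbers quoted in this message; the variable names are illustrative and it does not re-run the benchmark.

```python
# Timings copied from the timeit output above (microseconds per loop).
unpatched_usec = 50.8
patched_usec = 63.3

# Relative increase in time per decode call.
time_increase = (patched_usec - unpatched_usec) / unpatched_usec

# Relative drop in throughput (decode calls per unit of time).
throughput_drop = 1 - unpatched_usec / patched_usec

print(f"time per call: +{time_increase:.1%}")
print(f"throughput:    -{throughput_drop:.1%}")
```

Measured as lost throughput the difference is close to 20%; measured as extra time per call it is somewhat larger.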
And I'm not sure that fixing this only for the surrogatepass handler is enough. The other standard error handlers appear to work, but what if a user-defined handler consumes more than one byte?
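
A minimal sketch of how one might check this, comparing one-shot decoding with byte-at-a-time incremental decoding. The handler name "skip_two_bytes" and the helper decode_bytewise are hypothetical and exist only for this illustration; the incremental results depend on the Python version and on whether the patch is applied, so no expected output is given for them.

```python
import codecs


def skip_two_bytes(exc):
    """Hypothetical handler: emit U+FFFD and resume two bytes after the error start."""
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    # Clamp the resume position so it never points past the end of the input.
    resume = min(exc.start + 2, len(exc.object))
    return ("\ufffd", resume)


# Register under an illustrative name; this is not a standard handler.
codecs.register_error("skip_two_bytes", skip_two_bytes)


def decode_bytewise(data, errors):
    """Feed data to the UTF-8 incremental decoder one byte at a time."""
    decoder = codecs.getincrementaldecoder("utf-8")(errors)
    try:
        parts = [decoder.decode(data[i:i + 1]) for i in range(len(data))]
        parts.append(decoder.decode(b"", final=True))
    except UnicodeDecodeError as exc:
        return exc  # return the failure so both cases can be printed
    return "".join(parts)


samples = [
    (b"\xed\xa4\x80", "surrogatepass"),  # UTF-8-style encoding of lone surrogate U+D900
    (b"a\xff\xfeb", "skip_two_bytes"),   # handler consumes more than one byte
]

for data, errors in samples:
    one_shot = data.decode("utf-8", errors)
    chunked = decode_bytewise(data, errors)
    shown = ascii(chunked) if isinstance(chunked, str) else repr(chunked)
    print(errors, "one-shot:", ascii(one_shot), "incremental:", shown)
```

If the two columns differ for a handler, the incremental decoder is not giving that handler the same view of the input as the one-shot decoder does.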
----------
components: +Interpreter Core
priority: normal -> high
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue24214>
_______________________________________