[issue12892] UTF-16 and UTF-32 codecs should reject (lone) surrogates

Serhiy Storchaka report at bugs.python.org
Tue Oct 8 12:28:07 CEST 2013


Serhiy Storchaka added the comment:

I repeat myself: even with the patch, the UTF-16 codec is faster than the UTF-8 codec (except on ASCII-only data). It is the fastest Unicode codec in Python (perhaps UTF-32 can be made faster, but that is another issue).
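
For anyone who wants to check the speed claim on their own build, a rough timing sketch follows. The helper name time_codec, the sample text and the repeat count are illustrative assumptions, not part of the patch, and the numbers will vary by machine and Python version.

```python
import timeit


def time_codec(text, codec, number=200):
    """Return (encode_seconds, decode_seconds) for `number` round trips.

    Hypothetical helper for illustration only; it is not part of the patch.
    """
    encoded = text.encode(codec)
    enc = timeit.timeit(lambda: text.encode(codec), number=number)
    dec = timeit.timeit(lambda: encoded.decode(codec), number=number)
    return enc, dec


if __name__ == "__main__":
    # Non-ASCII sample (Cyrillic letters), since ASCII-only data is the
    # case where UTF-8 is expected to win.
    sample = "\u0430\u0431\u0432\u0433" * 25000
    for codec in ("utf-8", "utf-16-le", "utf-32-le"):
        enc, dec = time_codec(sample, codec)
        print("%-10s encode %.4fs  decode %.4fs" % (codec, enc, dec))
```

The explicit -le variants are used so that BOM handling does not enter the measurement.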

> The real question is: Can the UTF-16/32 codecs be made fast
> while still detecting lone surrogates ? Not whether UTF-16
> is widely used or not.

Yes, they can. But let's defer this to other issues.
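
To make the behaviour under discussion concrete, here is a minimal sketch of what rejecting lone surrogates looks like from Python code. It assumes an interpreter in which the change from this issue is applied (Python 3.4 or later, as far as I know). The helper names rejects_on_encode and rejects_on_decode are made up for this example.

```python
LONE = "\ud800"  # a high surrogate with no matching low surrogate


def rejects_on_encode(codec):
    """Return True if `codec` refuses to encode a lone surrogate."""
    try:
        LONE.encode(codec)
    except UnicodeEncodeError:
        return True
    return False


def rejects_on_decode(data, codec):
    """Return True if `codec` refuses to decode `data`."""
    try:
        data.decode(codec)
    except UnicodeDecodeError:
        return True
    return False


# With the patch applied, all three codecs are expected to reject the
# lone surrogate when encoding.
results = {c: rejects_on_encode(c) for c in ("utf-8", "utf-16", "utf-32")}

# The 'surrogatepass' error handler opts out of the check explicitly.
text = "a" + LONE + "b"
raw = text.encode("utf-16-le", "surrogatepass")
round_tripped = raw.decode("utf-16-le", "surrogatepass")
```

Code that really needs to carry lone surrogates through UTF-16 or UTF-32 can ask for that with surrogatepass, while the default strict mode stays safe.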

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue12892>
_______________________________________

