[issue8271] str.decode('utf8', 'replace') -- conformance with Unicode 5.2.0

John Machin report at bugs.python.org
Sat Jul 3 11:36:44 CEST 2010


John Machin <sjmachin at users.sourceforge.net> added the comment:

About the E0 80 81 61 problem: my interpretation is that you are correct, the 80 is not valid in the current state (start byte == E0), so no look-ahead, three FFFDs must be issued followed by 0061. I don't really care about issuing too many FFFDs so long as it doesn't munch valid sequences. However it would be very nice to get an explicit message about surrogates.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue8271>
_______________________________________


More information about the Python-bugs-list mailing list