[issue8271] str.decode('utf8', 'replace') -- conformance with Unicode 5.2.0

STINNER Victor report at bugs.python.org
Wed Apr 7 11:02:04 CEST 2010


STINNER Victor <victor.stinner at haypocalc.com> added the comment:

> >> I also found out that, according to RFC 3629, surrogates
> >> are considered invalid and they can't be encoded/decoded,
> >> but the UTF-8 codec actually does it.
> >
> > Python2 does, but Python3 raises an error.
> > (...)
> 
> I wonder how that change got into the 3.x branch - I would certainly
> not have approved it for the reasons given further up on this ticket.
> 
> I think we should revert that change for Python 3.2.

See r72208 and issue #3672.

pitrou wrote: "We could fix it for 3.1, and perhaps leave 2.7 unchanged if some
people rely on this (for whatever reason)."
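
For reference, here is a minimal sketch of the Python 3 behaviour under discussion, assuming Python 3.1 or later (where the "surrogatepass" error handler exists). The helper names and the choice of U+D800 are illustrative only and do not come from the tracker; the exact output of the 'replace' handler varies between versions, so the sketch only checks that a replacement character appears.

```python
# Illustrative sketch; assumes Python 3.1+ (for the "surrogatepass" handler).
# U+D800 is an arbitrary lone surrogate chosen for the example.
lone_surrogate = "\ud800"
# The three bytes a lenient (Python 2 style) UTF-8 encoder emits for U+D800.
surrogate_bytes = b"\xed\xa0\x80"


def strict_encode_fails(text):
    """Return True if the strict UTF-8 codec refuses to encode text."""
    try:
        text.encode("utf-8")
    except UnicodeEncodeError:
        return True
    return False


def strict_decode_fails(data):
    """Return True if the strict UTF-8 codec refuses to decode data."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False


# Strict mode rejects surrogates in both directions, in line with RFC 3629.
print(strict_encode_fails(lone_surrogate))
print(strict_decode_fails(surrogate_bytes))

# The old lenient behaviour is still available on request.
print(lone_surrogate.encode("utf-8", "surrogatepass") == surrogate_bytes)
print(surrogate_bytes.decode("utf-8", "surrogatepass") == lone_surrogate)

# With 'replace', the invalid bytes become U+FFFD instead of a surrogate.
replaced = surrogate_bytes.decode("utf-8", "replace")
print("\ufffd" in replaced and lone_surrogate not in replaced)
```

Code that relies on the Python 2 behaviour can opt in with "surrogatepass" explicitly, which keeps the strict default conformant.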

----------
title: str.decode('utf8',	'replace') -- conformance with Unicode 5.2.0 -> str.decode('utf8', 'replace') -- conformance with Unicode 5.2.0

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue8271>
_______________________________________

