
On Wed, Jan 31, 2018 at 3:03 AM, Serhiy Storchaka <storchaka@gmail.com> wrote:
19.01.18 05:51, Guido van Rossum wrote:
Can someone explain to me why this is such a controversial issue?
It seems reasonable to me to add new encodings to the stdlib that do the roundtripping requested in the first message of the thread. As long as they have new names that seems to fall under "practicality beats purity". (Modifying existing encodings seems wrong -- did the feature request somehow transmogrify into that?)
In any case you need to change your code. If we add a new error handler, you need to change the decoding code to use this error handler:
text = data.decode(encoding, 'whatwgreplace')
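For illustration only, here is a minimal sketch of how such a handler could be registered today with codecs.register_error. The name 'whatwgreplace' comes from the message above and is not part of the stdlib. The mapping rule (each undecodable byte becomes the code point with the same value) is an assumption chosen to mimic the WHATWG treatment of the undefined windows-1252 bytes. The sketch covers decoding only.

```python
import codecs


def whatwg_replace(exc):
    """Hypothetical handler: map each undecodable byte to the code point
    with the same numeric value (0x81 -> U+0081)."""
    if isinstance(exc, UnicodeDecodeError):
        bad = exc.object[exc.start:exc.end]
        # Return the replacement text and the position at which to resume.
        return "".join(chr(b) for b in bad), exc.end
    # Encoding errors are out of scope for this sketch.
    raise exc


# 'whatwgreplace' is the name used in the thread, not a stdlib handler.
codecs.register_error("whatwgreplace", whatwg_replace)

data = b"caf\xe9 \x81"  # 0x81 is undefined in Python's cp1252
text = data.decode("cp1252", "whatwgreplace")
```

Every call site still has to pass the handler name explicitly, which is the code change described above.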
If we add new encodings, you need to maintain an alias table that maps standard encoding names to the corresponding WHATWG encoding names:
aliases = {
    'windows_1252': 'windows-1252-whatwg',
    'windows_1251': 'windows-1251-whatwg',
    'utf_8': 'utf-8-whatwg',  # utf-8 + surrogatepass
    ...
}
...
text = data.decode(aliases.get(normalize_encoding(encoding), encoding))
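As a rough sketch, the lookup step alone could be written as follows. The '-whatwg' codec names are the hypothetical ones from the message and do not exist in the stdlib, so the sketch only resolves names and does not decode with them. The helper resolve_encoding is likewise an illustrative name. Note that encodings.normalize_encoding does not lowercase its input, so the sketch lowercases before the lookup.

```python
from encodings import normalize_encoding

# Hypothetical target names taken from the message; no such codecs ship
# with Python.
ALIASES = {
    "windows_1252": "windows-1252-whatwg",
    "windows_1251": "windows-1251-whatwg",
    "utf_8": "utf-8-whatwg",
}


def resolve_encoding(encoding):
    """Return the WHATWG variant name if one is listed, else the input."""
    key = normalize_encoding(encoding).lower()
    return ALIASES.get(key, encoding)


resolved = resolve_encoding("Windows-1252")
unchanged = resolve_encoding("latin-1")
```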
I don't see an advantage of the second approach for the end user. And of course it is more costly for maintainers, because we would need to implement around 20 new encodings, and it adds a cognitive burden for new Python users, who would now have more tables of encodings in the documentation.
Hm. As a user, unless I run into problems with a specific encoding, I never care about how many encodings we have, so I don't see how adding extra encodings bothers those users who have no need for them.

There's a reason to prefer new encoding names (maybe augmented with alias table) over a new error handler: there are lots of places where encodings are passed around via text files, Internet protocols, RPC calls, layers and layers of function calls. Many of these treat the encoding as a string, not as a (string, errorhandler) pair. So there may be situations where there is no way in a given API to preserve the need for using a special error handler, while the API would not have a problem preserving just the encoding name.

--
--Guido van Rossum (python.org/~guido)