Chris Angelico writes:
Huh. Is that level of generality actually still needed? Can Python deprecate all but a small handful of encodings?
I think that's pointless. With few exceptions (GB18030; Big5, which has a couple of code point pairs that encode the same very rare characters; and the ISO 2022 extensions) you're not going to run into the confusables problem, and AFAIK the only generic BIDI solution is Unicode (the ISO 8859 encodings of Hebrew and Arabic do not have direction markers). What exactly are you thinking?

The only thing I'd like to see is a rearrangement of the codec aliases so that the "common names" denote the maximal repertoire in each family (gb would denote gb18030, sjis would denote shift_jisx0213, etc.), as in the WHATWG recommendations for web browsers. But that's probably too backward-incompatible to fly.
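To make the alias idea concrete, here is a rough Python sketch, not a proposal for how it would actually be implemented. The PROPOSED_ALIASES table and the helper names are hypothetical, and the keys are just spellings picked for illustration ("gb2312" stands in for the "gb" common name). It only uses codecs.lookup() and str.encode() from the standard library.

```python
import codecs

# Hypothetical table: common name -> maximal repertoire in its family.
# This is an illustration only; nothing like it exists in the stdlib.
PROPOSED_ALIASES = {
    "sjis": "shift_jisx0213",
    "gb2312": "gb18030",
}


def current_codec(name):
    """Return the canonical codec name Python resolves `name` to today."""
    return codecs.lookup(name).name


def proposed_codec(name):
    """Return the codec `name` would resolve to under the hypothetical table."""
    return codecs.lookup(PROPOSED_ALIASES.get(name, name)).name


def can_encode(text, name):
    """Report whether `text` is inside the repertoire of codec `name`."""
    try:
        text.encode(name)
    except UnicodeEncodeError:
        return False
    return True


if __name__ == "__main__":
    for common in PROPOSED_ALIASES:
        print(common, "->", current_codec(common), "/", proposed_codec(common))
```

The backward-compatibility worry shows up in can_encode(): a string that fails under today's "gb2312" would succeed once the name pointed at gb18030, so code that relies on the UnicodeEncodeError would change behavior silently.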