[Python-ideas] Support WHATWG versions of legacy encodings

Stephen J. Turnbull turnbull.stephen.fw at u.tsukuba.ac.jp
Wed Jan 17 00:37:18 EST 2018


Random832 writes:

 > There are plenty of standard encodings that do have actual
 > representations of the control characters.

My complaint was not about coded character sets that don't conform to
ISO 2022's conventions about control vs. graphic blocks, especially in
the C1 block.  It was about promoting *unassigned* codes to the
Unicode scalars with the same integer values.  These codes don't
correspond to characters.  They are undefined as far as codecs are
concerned.
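
To make the distinction concrete, here is a minimal sketch.  It assumes
CPython's stdlib "cp1252" codec, which leaves the bytes 0x81, 0x8D,
0x8F, 0x90 and 0x9D undefined.  The helper names are hypothetical and
exist only for illustration; "promotion" here means the behavior I am
objecting to, not anything the stdlib does today.

```python
# Bytes that the stdlib cp1252 codec treats as undefined (assumption:
# CPython's current cp1252 mapping table).
UNASSIGNED_CP1252 = bytes([0x81, 0x8D, 0x8F, 0x90, 0x9D])


def strict_decode_fails(byte_value):
    """Return True if the stdlib codec refuses to decode this single byte."""
    try:
        bytes([byte_value]).decode("cp1252")
    except UnicodeDecodeError:
        return True
    return False


def promote_unassigned(data):
    """Hypothetical decoder: decode as cp1252, but map each undefined
    byte to the Unicode scalar with the same integer value."""
    out = []
    for b in data:
        try:
            out.append(bytes([b]).decode("cp1252"))
        except UnicodeDecodeError:
            # Undefined in the codec: "promote" the byte to U+00xx,
            # which lands in the C1 control block.
            out.append(chr(b))
    return "".join(out)
```

With the strict codec, every byte in UNASSIGNED_CP1252 is an error.
With promotion, b"\x81" silently becomes U+0081, a C1 control that the
charset never assigned, while an assigned byte such as 0x80 still
decodes to the euro sign.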

In the case of windows-125x charsets, even though they are IANA
registered, Microsoft reserves the right to change and even ignore the
published repertoire without updating it.  There I think it's
reasonable to use WHATWG graphic character repertoires even in
Python's stdlib codecs, and I wouldn't be surprised if Microsoft were
willing to delegate definition of those repertoires to the WG in the
end.


