[Python-ideas] [issue33865] [EASY] Missing code page aliases: "unknown encoding: 874"

Stephen J. Turnbull turnbull.stephen.fw at u.tsukuba.ac.jp
Thu Jun 21 03:17:24 EDT 2018

Ronald Oussoren writes:

 > Possibly just for the “cp…” encodings, but IMHO only if we confirm
 > that the code to look for the preferred encoding returns a codepage
 > number on Windows and changing that code leads to worse results
 > than adding numeric aliases for the “cp…” encodings.

Almost all of the CPxxx encodings have multiple aliases[1], so I just
don't see the point unless numeric-only code page designations are
baked into default "locales"[2] in official releases by major OS
vendors.  And probably not even then, since it should be easy enough
to provide a proper "locale" and/or PYTHONIOENCODING setting.
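
For anyone who wants to check what their own interpreter does with
these names, here is a minimal sketch.  The helper name
resolve_encoding is mine (hypothetical, not anything in the stdlib),
and whether the bare "874" resolves depends on the Python version you
run, so I make no claim about that result here.

```python
import codecs


def resolve_encoding(name):
    """Return the canonical codec name for `name`, or None if unknown.

    Hypothetical helper for illustration; it only wraps codecs.lookup().
    """
    try:
        return codecs.lookup(name).name
    except LookupError:
        # This is the condition behind "unknown encoding: ..." errors.
        return None


# "cp874" is a registered codec name; the bare number may or may not be,
# depending on the interpreter version.
for candidate in ("cp874", "874"):
    print(candidate, "->", resolve_encoding(candidate))

# Round-trip one Thai character (KO KAI, U+0E01) through cp874.
thai = "\u0e01"
encoded = thai.encode("cp874")
print(encoded, encoded.decode("cp874") == thai)
```

If the numeric name comes back as None on a given system, setting
PYTHONIOENCODING=cp874 in the environment before starting Python is
the kind of configuration-side fix I have in mind.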

Of course we should help the reporter figure out what's going on and
help them fix it with appropriate system configuration.  If that
doesn't work, then (and *only then*) we could think about doing a
stupid thing.

[1]  Granted, "874" only has "windows-874" registered with the IANA,
so it's kind of salient.  Still, if numeric-only aliases were a
"thing", surely we'd have heard about it by now.  I first encountered
Thai encodings in 1990 (OK, that was TIS 620, but windows-874 is
basically TIS 620 plus Microsoft punctuation extensions, IIRC), and
Thais do use computers in their native language a lot.

[2]  Scare quotes to refer to appropriate platform facilities, as
neither Windows nor Mac OS is strictly conformant to POSIX on this.
