[Python-ideas] [issue33865] [EASY] Missing code page aliases: "unknown encoding: 874"
Stephen J. Turnbull
turnbull.stephen.fw at u.tsukuba.ac.jp
Mon Jun 25 10:50:02 EDT 2018
Ronald Oussoren writes:
> The user shouldn’t have to do anything other than install Python. IMHO
> we're doing something wrong when the python interpreter doesn’t start up
> with a default system configuration
There's no evidence in the issue that I can see that suggests that the
user installed Python into the default system configuration. I see a
bunch of Python developers who have no access to the OP's system
configuration demonstrating that something that shouldn't work and never
has worked doesn't work, then providing a patch to make it work. This
despite the fact that the OP hasn't provided any configuration details
that would confirm this is a system default setting.
I wouldn't object to making it work if there were any evidence that it
is a real problem that other users will encounter. But there isn't any
such evidence yet; it's a non-standard alias according to Microsoft's
own IANA registration, and Steven d'Aprano's argument that such aliases
may be ambiguous is plausible, though I haven't seen confirmation it
would be a problem in practice.
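
For readers who want to check what their own interpreter accepts, a minimal sketch using the standard `codecs.lookup` follows. The helper name `is_known_encoding` is hypothetical, and whether the bare name "874" resolves depends on the Python version in use, so the sketch reports the result rather than assuming it.

```python
import codecs


def is_known_encoding(name):
    """Return True if this interpreter can resolve `name` to a codec."""
    try:
        codecs.lookup(name)
    except LookupError:
        # Raised for names that are neither codecs nor registered aliases.
        return False
    return True


# "cp874" is the standard codec name; "874" is the bare alias from the issue;
# "ytrewq" stands in for a name no codec registry knows.
for candidate in ("cp874", "874", "ytrewq"):
    print(candidate, is_known_encoding(candidate))
```
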
> (when the user explicitly sets a bogus PYTHONIOENCODING or locale all
> bets are off,
I'm assuming that is the case, based on the fact that neither of my two
;-) Thai students ever had this problem, nor have I seen a report of
this problem for any encoding in either Emacs or Python contexts since
about 1990, nor has the OP posted anything about his/her
> although even then warning about and then ignoring bad settings
> would be more user-friendly than the current behavior)
If Python is told to talk YTREWQ and it doesn't know how to talk YTREWQ,
ignoring the problem is not possible if any input or output in YTREWQ is
required. The program will crash with a much harder to understand error
message describing "undecodable input" in an encoding the user doesn't
expect. My own experience is that soldiering on is the least user-
friendly thing to do, as typically there's a trivial change that the
user can make to resolve the problem optimally.
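
The "undecodable input" failure mode can be sketched as follows. This is an illustration only, not what the interpreter does at startup: it assumes the data really is in code page 874 while the program has silently settled on ASCII. The helper `try_decode` is hypothetical.

```python
# One Thai letter (U+0E01) encoded the way a cp874 terminal would send it.
thai_bytes = "\u0e01".encode("cp874")


def try_decode(data, encoding):
    """Decode `data`, returning the text or a short error description."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return "error: " + str(exc)


# Decoding with the encoding the data was written in succeeds.
print(try_decode(thai_bytes, "cp874"))
# Decoding with a silently substituted encoding fails later, at I/O time,
# with a message about an encoding the user never asked for.
print(try_decode(thai_bytes, "ascii"))
```
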
The obvious thing to do is to fall back to ASCII, which almost certainly
is compatible with the terminal, the log files, and the user's eyes and
brain, emit a warning, and quit. That is what we do. The warning seems
OK: the OP also diagnosed the missing alias, likely with little trouble.
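
As a rough application-level sketch of the "ASCII message, then quit" policy described above (not CPython's actual startup code), the hypothetical helper `require_encoding` below either returns the canonical codec name or exits with a message restricted to ASCII.

```python
import codecs
import sys


def require_encoding(name):
    """Return the canonical codec name, or exit with an ASCII-only message."""
    try:
        return codecs.lookup(name).name
    except LookupError:
        # Keep the message pure ASCII so any terminal or log file can show it.
        message = "unknown encoding: %s" % name
        safe = message.encode("ascii", "backslashreplace").decode("ascii")
        # sys.exit with a string prints it to stderr and exits with status 1.
        sys.exit(safe)


if __name__ == "__main__":
    print(require_encoding("cp874"))
```

The point of quitting rather than continuing is that the message names the exact setting that needs to change.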