[Python-ideas] changing sys.stdout encoding
Stephen J. Turnbull
stephen at xemacs.org
Wed Jun 6 05:28:57 CEST 2012
Amaury Forgeot d'Arc writes:
> 2012/6/5 Stephen J. Turnbull <stephen at xemacs.org>
> > I wouldn't object to a method with the semantics of reinitialization,
> > but it should have a name implying reinitialization. It probably
> > should also error if the stream is open and has been written to.
> What do you think of the following method TextIOWrapper.reset_encoding?
> (the assert statements should certainly be replaced by some
I think that it's an attractive nuisance because it doesn't close the
stream, and therefore permits changing the encoding without any
warning partway through the stream. There are two reasonable (for a
very generous definition of "reasonable"<wink/>) ways to handle
multiple scripts in one stream: Unicode and ISO 2022. Simply changing
encodings in the middle is a recipe for disaster in the absence of a
higher-level protocol for signaling this change (that's the role ISO
2022 fulfils, but it is detested by almost everybody...). If you want
to do that kind of thing, the "import codecs; sys.stdout = ..." idiom
is available, but I don't see a need to make it convenient.
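
For concreteness, here is a minimal sketch of what that idiom tends to look
like and why switching codecs mid-stream is hazardous. It is only an
illustration under stated assumptions: an in-memory io.BytesIO stands in for
sys.stdout.buffer so the result can be inspected, the UTF-8/EUC-JP pair is an
arbitrary choice, and decodes_as is a hypothetical helper, none of which
comes from the thread.

```python
import codecs
import io

# Assumption: a BytesIO stands in for sys.stdout.buffer so the bytes can be
# inspected. In a real script the idiom would be roughly
#     sys.stdout = codecs.getwriter("utf-8")(sys.stdout.buffer)
raw = io.BytesIO()

text = "日本語\n"

# First wrap the byte stream with a UTF-8 writer and emit some text.
out = codecs.getwriter("utf-8")(raw)
out.write(text)

# Rewrap the same byte stream with a different codec partway through.
# Nothing warns about this, and nothing in the stream marks the switch.
out = codecs.getwriter("euc_jp")(raw)
out.write(text)

data = raw.getvalue()


def decodes_as(encoding):
    """Hypothetical helper: can the whole stream be decoded with one codec?"""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


utf8_ok = decodes_as("utf-8")
eucjp_ok = decodes_as("euc_jp")

# A reader can only recover the text by knowing where the switch happened.
split = len(text.encode("utf-8"))
first_half = data[:split].decode("utf-8")
second_half = data[split:].decode("euc_jp")
```

A reader who is not told the byte offset of the switch has no single codec
that decodes the stream, which is the gap a higher-level signaling protocol
would have to fill.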
But the OP's request is pretty clearly not for a generic
.set_encoding(); it's for a more convenient way for users to
initialize the stream's encoding.
Aside to Victor: at least on Mac OS X, I find that Python 3.2 (current
MacPorts; I can investigate further if you need it) doesn't respect
the language environment as I would expect it to. "LC_ALL=ja_JP.UTF8
python32" gives me an out-of-range Unicode error if I try to input
Japanese using "import sys; sys.stdin.readline()" -- I have to set
"PYTHONIOENCODING=UTF8" to get useful behavior.
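
The workaround can be checked from a script. The sketch below is an
illustration only, not a diagnosis of the MacPorts behavior: it starts a
child interpreter with PYTHONIOENCODING set and asks it which encoding it
chose for sys.stdin. The helper name child_stdin_encoding is hypothetical,
and the sketch assumes a Python recent enough to provide subprocess.run.

```python
import codecs
import os
import subprocess
import sys


def child_stdin_encoding(io_encoding):
    """Hypothetical helper: report sys.stdin.encoding of a child interpreter
    started with PYTHONIOENCODING set to io_encoding."""
    # Copy the current environment and override only PYTHONIOENCODING.
    env = dict(os.environ, PYTHONIOENCODING=io_encoding)
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.stdin.encoding)"],
        env=env,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.PIPE,
        check=True,
    )
    # Codec names are plain ASCII, so decoding the report as ASCII is safe.
    return result.stdout.decode("ascii").strip()


reported = child_stdin_encoding("utf-8")

# Normalize the spelling, since codec names have several aliases.
normalized = codecs.lookup(reported).name
```

The environment variable is read at interpreter startup, so it has to be set
before the child process is launched rather than from inside it.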
There may also be cases where multiple users with different language
needs are working at the same workstation.
For both of these cases a command-line option to initialize the
encoding would be convenient.