[Python-ideas] changing sys.stdout encoding
Rurpy
rurpy at yahoo.com
Wed Jun 6 08:05:35 CEST 2012
On 06/05/2012 01:37 PM, Stephen J. Turnbull wrote:
> Rurpy writes:
>
> > It is excessively complex for what is conceptually a simple and
> > straight-forward operation.
>
> The operation is not conceptually straightforward. The problem is
> that you can't just change the encoding of an open stream, encodings
> are generally stateful. The straightforward way to deal with this
> issue is to close the stream and reinitialize it. Your proposed
> .set_encoding() method implies something completely different about
> what's going on.
I'm not sure why stateful matters. When you change encoding
you discard whatever state exists and start with the new encoder
in its initial state. If there is a partially en/decoded
character, then wouldn't you do the same thing you'd do if the same
condition arose at EOF?
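To make the EOF analogy concrete, here is a minimal sketch using the incremental decoders in the standard codecs module. The helper name switch_decoder is hypothetical; it only illustrates "finish the old state as at EOF, then start fresh" and is not a proposed API.

```python
import codecs

# Hypothetical helper: finish the old decoder as if EOF had been reached,
# then hand back a new decoder in its initial state.
def switch_decoder(old_decoder, new_encoding, errors="strict"):
    # final=True makes the old decoder resolve any buffered partial
    # character through its error handler, just as it would at end of input.
    tail = old_decoder.decode(b"", final=True)
    new_decoder = codecs.getincrementaldecoder(new_encoding)(errors=errors)
    return tail, new_decoder

dec = codecs.getincrementaldecoder("utf-8")(errors="replace")
head = dec.decode(b"\xe2\x82")        # first two bytes of a three-byte character
tail, dec2 = switch_decoder(dec, "latin-1")
text = dec2.decode(b"\xe9", final=True)
```

With errors="replace" the dangling bytes come out as a replacement character in tail; with errors="strict" the same call raises UnicodeDecodeError, which is what a truncated character at EOF would do as well.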
> I wouldn't object to a method with the semantics of reinitialization,
> but it should have a name implying reinitialization. It probably
> should also error if the stream is open and has been written to.
>
> > Needing to change the encoding of a sys.std* stream is not an
> > uncommon need and a user should not have to go through the
> > codecs dance above to do so IMO.
>
> I suspect needing to *change* the encoding of an open stream is
> generally quite rare. Needing to *initialize* the std* streams with
> an appropriate codec is common. That's why it doesn't so much matter
> that PYTHONIOENCODING can't be changed within a program.
You are correct that my current concern is reinitializing
the encoding(s) of the sys.std* streams prior to doing any
operations with them. I thought that changing the encoding
at any point would be a straight-forward generalization.
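For the reinitialize-before-use case, something along these lines is what I have in mind. This is only a sketch: reinit_encoding is a made-up name, and it assumes the stream is an io.TextIOWrapper (as the sys.std* streams normally are in Python 3) that has not been used yet.

```python
import io

# Hypothetical helper: rewrap the binary buffer underneath a text stream
# with a new encoding. The old wrapper is unusable after detach().
def reinit_encoding(stream, encoding, errors=None):
    line_buffering = stream.line_buffering
    if errors is None:
        errors = stream.errors        # keep the old error handler by default
    stream.flush()                    # push out anything already written
    raw = stream.detach()             # take back the underlying binary buffer
    return io.TextIOWrapper(raw, encoding=encoding, errors=errors,
                            line_buffering=line_buffering)

# Demonstrated on an in-memory stream; for the real thing one would write
#     sys.stdout = reinit_encoding(sys.stdout, "utf-8")
raw = io.BytesIO()
old = io.TextIOWrapper(raw, encoding="latin-1")
new = reinit_encoding(old, "utf-8")
new.write("\u00e9")
new.flush()
```

Later Python versions (3.7 and up) provide TextIOWrapper.reconfigure(), which covers the same ground without replacing the stream object.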
However, I have in the past encountered programs that output mixed
encodings in two contexts: generating test data (I think it was
for automatic detection and extraction of information), and
bundling multiple differently-encoded data sets in one package
that were pulled apart again downstream.
That both uses probably could have been designed better is irrelevant;
a hypothetical Python programmer's job would have been to produce
a Python program that would fit into the existing processes.
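As an illustration of that kind of mixed-encoding output, a sketch that bypasses the text layer and writes pre-encoded bytes to a binary stream. The function name write_mixed and the sample data are made up for the example; on the real stdout the binary stream would be sys.stdout.buffer.

```python
import io

# Hypothetical helper: write each text segment in its own encoding.
# str.encode() starts from a fresh encoder state for every segment.
def write_mixed(binary_stream, segments):
    for text, encoding in segments:
        binary_stream.write(text.encode(encoding))

out = io.BytesIO()                    # stand-in for sys.stdout.buffer
write_mixed(out, [("caf\u00e9\n", "latin-1"), ("caf\u00e9\n", "utf-8")])
```

It works, but it means giving up print() and the text stream for anything that has to be in a non-default encoding.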
However, I don't want to dwell on this because it is not my main
concern now; I just thought I would mention it for the record.
> I agree that use of PYTHONIOENCODING is pretty awkward.