[Python-ideas] Fix default encodings on Windows
Paul Moore
p.f.moore at gmail.com
Thu Aug 11 05:07:42 EDT 2016
On 11 August 2016 at 01:41, Chris Angelico <rosuav at gmail.com> wrote:
> I've almost never seen files stored in UTF-32 (even UTF-16 isn't all
> that common compared to UTF-8), so I wouldn't stress too much about
> that. Recognizing FE FF or FF FE and decoding as UTF-16 might be worth
> doing, but it could easily be retrofitted (that byte sequence won't
> decode as UTF-8).
I see UTF-16 relatively often as a result of redirecting stdout in
PowerShell and forgetting that it defaults (stupidly, IMO) to UTF-16.
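
For illustration, the BOM check discussed above could look roughly like the sketch below. The function name decode_with_bom_sniffing is hypothetical, and falling back to UTF-8 when no UTF-16 BOM is present is an assumption for this example, not something proposed in the thread. It also ignores UTF-32, whose little-endian BOM begins with the same FF FE bytes.

```python
import codecs


def decode_with_bom_sniffing(data: bytes) -> str:
    """Decode as UTF-16 when a UTF-16 BOM is present, otherwise as UTF-8.

    Hypothetical helper for illustration only. UTF-32 is deliberately
    not handled: its little-endian BOM (FF FE 00 00) would be mistaken
    for UTF-16 here.
    """
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec consumes the BOM and picks the byte order from it.
        return data.decode("utf-16")
    # FF FE / FE FF can never start valid UTF-8, so there is no ambiguity.
    # "utf-8-sig" additionally drops a UTF-8 BOM (EF BB BF) if one is present.
    return data.decode("utf-8-sig")
```

Because neither FF nor FE is a valid byte in UTF-8, a check like this can be added later without changing how existing UTF-8 files are read.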
>> The main problem here is that if the console is not forced to UTF-8 then it
>> won't render any of the characters correctly.
>
> Ehh, that's annoying. Is there a way to guarantee, at the process
> level, that the console will be returned to "normal state" when Python
> exits? If not, there's the risk that people run a Python program and
> then the *next* program gets into trouble.
There's also the risk that Python programs using subprocess.Popen
start the subprocess with the console in a non-standard state. Should
we be temporarily restoring the console codepage in that case? How
does the following work?
<start>
set codepage to UTF-8
...
set codepage back
spawn subprocess X, but don't wait for it
set codepage to UTF-8
...
... At this point what codepage does Python see? What codepage does
process X see? (Note that they are both sharing the same console).
...
<end>
restore codepage
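
The sequence above can be sketched in Python as follows. This is only a rough model, not a proposed implementation: the names console_codepage and spawn_with_original_codepage are hypothetical, and the getter/setter are passed in so the logic can be exercised without a real console. On Windows, _win32_console_cp shows one way to obtain them through ctypes (GetConsoleOutputCP and SetConsoleOutputCP from kernel32). The sketch assumes the code page is a property of the shared console rather than of each process, which is exactly why the question above matters.

```python
import contextlib

CP_UTF8 = 65001  # Windows identifier for the UTF-8 code page


def _win32_console_cp():
    """Return (getter, setter) for the console output code page (Windows only)."""
    import ctypes  # ctypes.windll exists only on Windows

    kernel32 = ctypes.windll.kernel32
    return kernel32.GetConsoleOutputCP, kernel32.SetConsoleOutputCP


@contextlib.contextmanager
def console_codepage(codepage, get_cp, set_cp):
    """Switch the console code page and restore the previous one on exit."""
    previous = get_cp()
    set_cp(codepage)
    try:
        yield previous
    finally:
        # Runs on normal exit and on exceptions, but not if the process
        # is killed, so it cannot *guarantee* the console is restored.
        set_cp(previous)


def spawn_with_original_codepage(spawn, original, get_cp, set_cp):
    """Call spawn() with `original` active, then switch back.

    If spawn() does not wait for the child, the child is still running
    when the code page is switched back, so under the shared-console
    assumption it sees the change too.
    """
    with console_codepage(original, get_cp, set_cp):
        return spawn()
```

Usage would be along the lines of "get_cp, set_cp = _win32_console_cp()" followed by "with console_codepage(CP_UTF8, get_cp, set_cp) as original: ...", passing original to spawn_with_original_codepage around each subprocess.Popen call.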
Paul