Re: [Python-ideas] Fix default encodings on Windows

The main idea of win_unicode_console is simple: use the WinAPI functions ReadConsoleW and WriteConsoleW to communicate with the interactive console on Windows, and wrap this in the standard Python IO hierarchy. That's why sys.std*.encoding would be 'utf-16-le': it corresponds to the wide-character strings used by the Windows wide APIs.

But this is only about sys.std*.encoding, which I think is not so important. AFAIK sys.std*.encoding should be used only when you want to communicate in bytes (which I think is not a good idea), so it tells you which encoding sys.std*.buffer assumes. In fact, sys.std* may not even have a buffer attribute, in which case its encoding attribute would be useless.

Unfortunately, sys.std*.encoding is used in some other places: the consumers of the old PyOS_Readline API (the tokenizer and input()) use it to decode the bytes returned. Actually, the consumers assume different encodings (sys.stdin.encoding vs. sys.stdout.encoding), so it is impossible to write a correct readline hook when the encodings are not the same. So I think it would be nice to have a Python-level, string-based implementation of readline hooks: a sys.readlinehook attribute, which would by default use sys.std* on Windows and GNU readline on Unix.

Nevertheless, I think it is a good idea to have more 'utf-8' defaults (or 'utf-8-readsig' for open()). I don't know whether opening the standard streams in 'utf-8' helps with the console issue.

Adam Bartoš
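To make the string-based readline hook idea concrete, here is a minimal sketch. It is only an illustration under stated assumptions: sys.readlinehook is a proposal and does not exist in CPython, and the name stream_readline_hook and its signature are hypothetical. The point it shows is that a hook working purely with str never needs to look at the streams' encoding, and works even with streams that have no buffer attribute.

```python
import io
import sys


def stream_readline_hook(prompt="", stdin=None, stdout=None):
    """Hypothetical string-based readline hook (not an existing Python API).

    It writes the prompt to a text stream and reads one line from another,
    so only str objects cross the boundary and no .encoding is consulted.
    """
    stdin = sys.stdin if stdin is None else stdin
    stdout = sys.stdout if stdout is None else stdout
    stdout.write(prompt)
    stdout.flush()
    return stdin.readline()


# Stand-ins for console streams: io.StringIO is a text stream without
# a .buffer attribute, so there are no underlying bytes to decode.
fake_in = io.StringIO("žluťoučký kůň\n")
fake_out = io.StringIO()

line = stream_readline_hook(">>> ", fake_in, fake_out)
```

Here the non-ASCII line comes back unchanged as a str, regardless of what encoding either stream would report.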