Re: [Python-ideas] Fix default encodings on Windows
On Fri Aug 12 11:33:35 EDT 2016, Random832 wrote:
> On Wed, Aug 10, 2016, at 15:08, Steve Dower wrote:
>> That's the hope, though that module approaches the solution differently
>> and may still uses. An alternative way for us to fix this whole thing
>> would be to bring win_unicode_console into the standard library and use
>> it by default (or probably whenever PYTHONIOENCODING is not specified).
>
> I have concerns about win_unicode_console:
> - For the "text_transcoded" streams, stdout.encoding is utf-8. For the
>   "text" streams, it is utf-16.

UTF-16 is the "native" encoding since it corresponds to the wide chars used by Read/WriteConsoleW. The UTF-8 is used just as a signal for the consumers of PyOS_Readline.

> - There is no object, as far as I can find, which can be used as an
>   unbuffered unicode I/O object.

There is no buffer just on those wrapping streams because the bytes I have are not in UTF-8. Adding one would mean a fake buffer that just decodes and writes to the text stream. AFAIK there is no guarantee that sys.std* objects have a buffer attribute, and any code relying on that is incorrect. But I understand that there may be such code and we may want to be compatible.

> - raw output streams silently drop the last byte if an odd number of
>   bytes are written.

That's not true: it doesn't write an odd number of bytes, but it returns the correct number of bytes written. If only one byte is given, it raises a ValueError.

> - The sys.stdout obtained via streams.enable does not support .buffer /
>   .buffer.raw / .detach
> - All of these objects provide a fileno() interface.

Is this wrong? If I remember, I provide it because of some check -- maybe in input() -- to be viewed as a stdio stream.

> - When using os.read/write for data that represents text, the data still
>   should be encoded in the console encoding and not in utf-8 or utf-16.

I don't know what to do with this. Generally I wouldn't use bytes to communicate textual data.

Regards,
Adam Bartoš
On Fri, Aug 12, 2016, at 12:24, Adam Bartoš wrote:
> There is no buffer just on those wrapping streams because the bytes I have are not in UTF-8. Adding one would mean a fake buffer that just decodes and writes to the text stream. AFAIK there is no guarantee that sys.std* objects have a buffer attribute, and any code relying on that is incorrect. But I understand that there may be such code and we may want to be compatible.
Yes, that's what I meant. I just think it needs to be considered if we're thinking about making it (or something like it) the default Python sys.std*. Maybe the decision will be that maintaining compatibility with these cases isn't important.
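To make the compatibility concern concrete, here is a minimal sketch of how calling code could cope with a sys.std* object that may or may not expose .buffer. The helper name write_bytes and the fallback_encoding parameter are made up for illustration; they are not part of win_unicode_console or the standard library, and the choice of fallback encoding is an assumption.

```python
import io


def write_bytes(stream, data, fallback_encoding="utf-8"):
    """Write bytes to a text stream without assuming it has .buffer.

    Hypothetical helper for illustration only. Returns which path was
    taken: "buffer" or "text".
    """
    buffer = getattr(stream, "buffer", None)
    if buffer is not None:
        # Flush pending text first so the output order is preserved.
        stream.flush()
        buffer.write(data)
        buffer.flush()
        return "buffer"
    # No binary layer: decode and go through the text interface instead.
    # The fallback encoding is an assumption made by the caller.
    stream.write(data.decode(fallback_encoding))
    stream.flush()
    return "text"


# io.StringIO has no .buffer, much like a replacement console stream might.
text_only = io.StringIO()
path_a = write_bytes(text_only, "héllo".encode("utf-8"))

# A TextIOWrapper over a binary stream does have .buffer.
raw = io.BytesIO()
wrapped = io.TextIOWrapper(raw, encoding="utf-8")
path_b = write_bytes(wrapped, b"abc")
```

Code that calls sys.stdout.buffer.write() directly, without a check like this, is the kind that would break if the default streams lost the attribute.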
>> - The sys.stdout obtained via streams.enable does not support .buffer /
>>   .buffer.raw / .detach
>> - All of these objects provide a fileno() interface.
>
> Is this wrong? If I remember, I provide it because of some check -- maybe in input() -- to be viewed as a stdio stream.
I don't know if it's *wrong* per se (same with the no buffer/raw thing etc), I'm just concerned about the possible effects on code that is written against the current implementation.
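As an illustration of the kind of existing code I have in mind, here is a rough sketch of a caller that probes a stream for a file descriptor before deciding how to write to it. The helper name get_fileno is hypothetical. Whether the descriptor it finds is actually appropriate for byte-level writes on a Windows console is exactly the open question; this only shows the probing pattern.

```python
import io
import tempfile


def get_fileno(stream):
    """Return the stream's file descriptor, or None if it has none.

    Hypothetical helper, not part of win_unicode_console or the stdlib.
    """
    try:
        return stream.fileno()
    except (AttributeError, io.UnsupportedOperation):
        # Either no fileno() method at all, or one that refuses to answer.
        return None


# An in-memory text stream has no descriptor behind it.
no_fd = get_fileno(io.StringIO())

# A real file does, so fd-based code would go on to use os.read/os.write.
with tempfile.TemporaryFile() as f:
    real_fd = get_fileno(f)
```

Code shaped like this treats a working fileno() as permission to bypass the stream object, which is why a replacement stream that provides fileno() changes what such code ends up doing.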