[Python-ideas] Fix default encodings on Windows

Random832 random832 at fastmail.com
Thu Aug 11 10:52:53 EDT 2016


On Thu, Aug 11, 2016, at 10:25, Steven D'Aprano wrote:
> > Interesting. Are you assuming that a text file cannot be empty?
> 
> Hmmm... not consciously, but I guess I was.
> 
> If the file is empty, how do you know it's text?

Heh. That's the *other* thing that Notepad does wrong in the opinion of
people coming from the Unix world - a Windows text file does not need to
end with a [CR]LF, and normally will not.

> But we're getting off topic here. In context of Steve's suggestion, we 
> should only autodetect UTF-8. In other words, if there's a UTF-8 BOM, 
> skip it, otherwise treat the file as UTF-8.

I think there's still room for UTF-16. Its two byte orders account for
two of the four encodings supported by Notepad, after all.
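
For illustration, here is a minimal sketch of BOM-based detection that
covers both the UTF-8 case quoted above and UTF-16. It is not a
proposed implementation: the function names detect_bom_encoding and
decode_text are hypothetical, and the fallback to UTF-8 when no BOM is
present is the assumption taken from the quoted suggestion.

```python
import codecs


def detect_bom_encoding(data: bytes) -> str:
    """Pick a codec name from a leading BOM, defaulting to UTF-8.

    Hypothetical helper. It does not handle UTF-32, whose little-endian
    BOM begins with the same two bytes as the UTF-16 LE BOM.
    """
    if data.startswith(codecs.BOM_UTF8):
        # "utf-8-sig" strips the BOM while decoding.
        return "utf-8-sig"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec reads the BOM to choose the byte order
        # and removes it from the decoded text.
        return "utf-16"
    # No BOM (including an empty file): treat the data as UTF-8.
    return "utf-8"


def decode_text(data: bytes) -> str:
    """Decode bytes using the encoding inferred from the BOM."""
    return data.decode(detect_bom_encoding(data))
```

An empty file has no BOM, so it falls through to plain UTF-8 and
decodes to an empty string.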

