
11 Aug 2016, 2:52 p.m.
On Thu, Aug 11, 2016, at 10:25, Steven D'Aprano wrote:
Interesting. Are you assuming that a text file cannot be empty?
Hmmm... not consciously, but I guess I was.
If the file is empty, how do you know it's text?
Heh. That's the *other* thing that Notepad does wrong in the opinion of people coming from the Unix world - a Windows text file does not need to end with a [CR]LF, and normally will not.
But we're getting off topic here. In the context of Steve's suggestion, we should only autodetect UTF-8. In other words, if there's a UTF-8 BOM, skip it; otherwise treat the file as UTF-8.
I think there's still room for UTF-16. It's two of the four encodings supported by Notepad, after all.
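
As a rough sketch of what BOM-based autodetection with UTF-16 included might look like: the names detect_encoding and decode_text are made up for illustration, not an existing or proposed API. It relies on the standard "utf-8-sig" and "utf-16" codecs, both of which consume a leading BOM when decoding.

```python
import codecs


def detect_encoding(raw: bytes) -> str:
    """Pick a codec name from a leading BOM, defaulting to UTF-8.

    Hypothetical helper for illustration only.
    """
    if raw.startswith(codecs.BOM_UTF8):
        # "utf-8-sig" decodes as UTF-8 and drops the BOM.
        return "utf-8-sig"
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec reads the BOM to choose the byte order
        # and drops it from the decoded text.
        return "utf-16"
    # No BOM (this includes the empty file): treat as UTF-8.
    return "utf-8"


def decode_text(raw: bytes) -> str:
    """Decode bytes using the encoding guessed by detect_encoding."""
    return raw.decode(detect_encoding(raw))
```

An empty file simply falls through to the UTF-8 default and decodes to an empty string, so the "is an empty file text?" question doesn't need a special case here. Note that this sketch ignores UTF-32, whose little-endian BOM begins with the same two bytes as the UTF-16 one.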