<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body><div><div style="font-family: Calibri,sans-serif; font-size: 11pt;">Unless someone else does the implementation, I'd rather add a utf8-readsig encoding that initially only skips a UTF-8 BOM - notably, you always get the same encoding, it just sometimes skips the first three bytes.<br><br>I think we can change this later to detect and switch to UTF-16 without it being disastrous, though we've made it this far without it and frankly there are good reasons to "encourage" UTF-8 over UTF-16.<br><br>My big concern is the console... I think that change is inevitably going to have to break someone, but I need to map out the possibilities first to figure out just how bad it'll be.<br><br>Top-posted from my Windows Phone</div></div><div dir="ltr"><hr><span style="font-family: Calibri,sans-serif; font-size: 11pt; font-weight: bold;">From: </span><span style="font-family: Calibri,sans-serif; font-size: 11pt;"><a href="mailto:random832@fastmail.com">Random832</a></span><br><span style="font-family: Calibri,sans-serif; font-size: 11pt; font-weight: bold;">Sent: </span><span style="font-family: Calibri,sans-serif; font-size: 11pt;">8/11/2016 7:54</span><br><span style="font-family: Calibri,sans-serif; font-size: 11pt; font-weight: bold;">To: </span><span style="font-family: Calibri,sans-serif; font-size: 11pt;"><a href="mailto:python-ideas@python.org">python-ideas@python.org</a></span><br><span style="font-family: Calibri,sans-serif; font-size: 11pt; font-weight: bold;">Subject: </span><span style="font-family: Calibri,sans-serif; font-size: 11pt;">Re: [Python-ideas] Fix default encodings on Windows</span><br><br></div>On Thu, Aug 11, 2016, at 10:25, Steven D'Aprano wrote:<br>> > Interesting. Are you assuming that a text file cannot be empty?<br>> <br>> Hmmm... not consciously, but I guess I was.<br>> <br>> If the file is empty, how do you know it's text?<br><br>Heh. 
That's the *other* thing that Notepad does wrong in the opinion of<br>people coming from the Unix world - a Windows text file does not need to<br>end with a [CR]LF, and normally will not.<br><br>> But we're getting off topic here. In context of Steve's suggestion, we <br>> should only autodetect UTF-8. In other words, if there's a UTF-8 BOM, <br>> skip it, otherwise treat the file as UTF-8.<br><br>I think there's still room for UTF-16. It's two of the four encodings<br>supported by Notepad, after all.<br></body></html>