Re: [Python-ideas] Fix default encodings on Windows

16 Aug 2016

      ...
Given that, I'm proposing adding support for using byte strings encoded with UTF-8 in file system functions on Windows. This allows Python users to omit switching code like:
if os.name == 'nt':
    f = os.stat(os.listdir('.')[-1])
else:
    f = os.stat(os.listdir(b'.')[-1])
REALLY? Do we really want to encourage using bytes as paths? IIUC,
anyone that wants to platform-independentify that code just needs to
use proper strings (or pathlib) for paths everywhere, yes?
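
For illustration, here is a minimal sketch of the quoted snippet written with str paths only, so no os.name switch is needed. The function names are hypothetical, and the sorted() call is an added assumption to make "last entry" deterministic; the quoted code takes whatever order os.listdir() returns.

```python
import os
from pathlib import Path


def stat_last_entry(directory="."):
    """Stat the last entry of `directory` using str paths on every platform."""
    # str in, str out: os.listdir() returns str names when given a str path.
    names = sorted(os.listdir(directory))  # sorting is an assumption, see lead-in
    if not names:
        return None
    return os.stat(os.path.join(directory, names[-1]))


def stat_last_entry_pathlib(directory="."):
    """Same idea with pathlib, which never exposes bytes paths."""
    entries = sorted(Path(directory).iterdir())
    return entries[-1].stat() if entries else None
```

Either function should behave the same on Windows and on *nix, which is the point being argued here.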

I understand that pre-surrogate-escape, there was a need for bytes
paths, but those days are gone, yes?
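
As a rough sketch of why surrogateescape removes that need: undecodable bytes are smuggled through str as lone surrogates and come back out unchanged. The helper names below are hypothetical; on POSIX, os.fsdecode() and os.fsencode() apply this same error handler with the filesystem encoding, while the example uses the codec directly so it runs the same way everywhere.

```python
def decode_name(raw: bytes) -> str:
    """Decode a file name, mapping undecodable bytes to lone surrogates."""
    return raw.decode("utf-8", errors="surrogateescape")


def encode_name(name: str) -> bytes:
    """Inverse of decode_name(): lone surrogates become the original bytes."""
    return name.encode("utf-8", errors="surrogateescape")


# A Latin-1 encoded name: the byte 0xE9 is not valid UTF-8 here.
raw = b"caf\xe9.txt"
name = decode_name(raw)

# Ordinary str manipulation still works on the decoded name...
stem, ext = name.rsplit(".", 1)

# ...and the original bytes are recovered exactly.
assert encode_name(name) == raw
```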

So why, at this late date, kludge what should be a deprecated pattern
into the Windows build???

-CHB
...
My proposal is to remove all use of the *A APIs and only use the *W APIs. That completely removes the (already deprecated) use of bytes as paths.
Yes, this is good.
...
I then propose to change the (unused on Windows) sys.getfilesystemencoding() to 'utf-8' and handle bytes being passed into filesystem functions by transcoding into UTF-16 and calling the *W APIs.
I'm really not sure utf-8 is magic enough to do this. Where do you
imagine that utf-8 is coming from as bytes???

AIUI, while utf-8 is almost universal in *nix for file system names,
folks do not want to count on it -- hence the use of bytes. And it is
far less prevalent in the Windows world...
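
To make the concern concrete, here is a hedged sketch of the transcoding step as described in the quoted proposal, plus a case where it fails. The function names are hypothetical and strict decoding is an assumption; the actual proposal may choose a different error handler.

```python
def utf8_path_to_wide(path: bytes) -> bytes:
    """Transcode a UTF-8 bytes path to UTF-16-LE, as a *W API would expect.

    Strict decoding is an assumption made for this sketch.
    """
    text = path.decode("utf-8")
    return text.encode("utf-16-le")


def is_valid_utf8(data: bytes) -> bool:
    """Report whether `data` decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Bytes produced by a legacy Windows code page are generally not UTF-8.
legacy = "caf\u00e9".encode("cp1252")   # b'caf\xe9'
utf8 = "caf\u00e9".encode("utf-8")      # b'caf\xc3\xa9'
```

Here is_valid_utf8(legacy) is False, so bytes that arrive from a code-page-based source would not survive the transcoding, whereas the UTF-8 form passes straight through.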
...
, allows paths returned from the filesystem to correctly roundtrip via bytes in Python,
That you could do with native bytes (UTF-16, yes?)
...
. But that would prevent basic manipulation which seems to be a higher priority.)
Still think Unicode is the answer to that...
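
A small sketch of the trade-off under discussion, with a made-up path: UTF-16 bytes round-trip fine, but byte-level manipulation breaks on them, while the same operation on str just works.

```python
# A hypothetical Windows-style relative path, as "native" UTF-16-LE bytes.
wide = "dir\\file.txt".encode("utf-16-le")

# Naive byte-level splitting cuts a 2-byte code unit in half.
pieces = wide.split(b"\\")


def decodes_cleanly(chunk: bytes) -> bool:
    """Report whether `chunk` is still well-formed UTF-16-LE."""
    try:
        chunk.decode("utf-16-le")
    except UnicodeDecodeError:
        return False
    return True


# The same manipulation on str is unproblematic.
text_pieces = wide.decode("utf-16-le").split("\\")
```

The first byte-level piece still decodes, but the second starts with the orphaned zero byte of the backslash and is no longer valid UTF-16.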
...
At this stage, it's time for us to either make byte paths an error,
+1.  :-)

CHB

Chris Barker - NOAA Federal