[Python-Dev] File system path encoding on Windows

Stephen J. Turnbull turnbull.stephen.fw at u.tsukuba.ac.jp
Tue Aug 23 04:46:47 EDT 2016


eryk sun writes:

 > I just wrote a simple function to enumerate the 822 system locales on
 > my Windows box (using EnumSystemLocalesEx and GetLocaleInfoEx, which
 > are Unicode-only functions), and 36.7% of them lack an ANSI codepage.
 > They're Unicode-only locales. UTF-8 is the only way to support these
 > locales with a bytes API.
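
To make the quoted claim concrete, here is a minimal sketch that runs on any platform. It assumes Hindi is one of the Unicode-only locales, and it uses 'cp1252' and 'cp932' as stand-ins for ANSI codepages, because the 'mbcs' codec exists only on Windows. The file name and the helper round_trips() are invented for illustration.

```python
# Hypothetical file name in Devanagari script; Hindi is assumed here to be
# one of the Windows locales that has no ANSI codepage.
name = "रिपोर्ट.txt"


def round_trips(text, codec):
    """Return True if text survives encode/decode through codec unchanged."""
    try:
        return text.encode(codec).decode(codec) == text
    except UnicodeError:
        # The codec has no representation for at least one character.
        return False


# cp1252 and cp932 stand in for ANSI codepages; 'mbcs' is Windows-only.
results = {codec: round_trips(name, codec)
           for codec in ("cp1252", "cp932", "utf-8")}
print(results)
```

Neither stand-in codepage can represent the name, so only the UTF-8 entry comes back True.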

Are the users of those locales banging on our door demanding such an API?

Apparently not; such banging would have resulted in a patch.  (That's
how you know it's a bang and not a whimper!)  Instead, Steve had to
volunteer one.

Pragmatically, I don't see anyone rushing to *supply* bytes-oriented
APIs, bytes-oriented networking stacks, or bytes-oriented applications
to the Windows world.  I doubt there are all that many purely bytes-
oriented libraries out there that are plug-compatible with existing
Windows libraries of similar functionality, and obviously superior.
So somebody's going to have to do some work to exploit this new
feature.

Who, and when?  If the answers are "uh, I dunno" and "eventually",
what's the big rush?  Making it possible to test such software on
Windows in the public release version of Python should be our goal for
3.6.  We can do that with an option to set the default codecs to
'utf-8', while the default remains the backward-compatible 'mbcs'.  How we
deal with the existing deprecation, I don't really care ("now is
better than never", and everything currently on the table will need a
policy kludge).
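
A rough sketch of what such an opt-in could look like, purely as illustration. The environment variable EXAMPLE_FS_UTF8 and both helper functions are hypothetical; they are not an existing Python option or API. The 'mbcs' codec is available only on Windows, so the decode helper will fail elsewhere unless the opt-in is set.

```python
import os

# Hypothetical switch name, invented for this sketch only.
OPT_IN_VAR = "EXAMPLE_FS_UTF8"


def choose_fs_codec(environ=None):
    """Return 'utf-8' when the opt-in is set, else the legacy 'mbcs'."""
    env = os.environ if environ is None else environ
    return "utf-8" if env.get(OPT_IN_VAR) == "1" else "mbcs"


def decode_path(raw, environ=None):
    """Decode a bytes path using whichever codec is currently selected."""
    # With the default ('mbcs') this call only works on Windows.
    return raw.decode(choose_fs_codec(environ))
```

Software that wants UTF-8 bytes paths would set the switch during testing. Everything else keeps the backward-compatible behaviour without changes.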

If, 9 months after the release of 3.6, there are apps targeting Windows
and using UTF-8 bytes APIs in beta (or nearing it), then we have an
excellent reason to default to 'utf-8' for 3.7.

And of course the patch eliminating use of the *A APIs with their lack
of error-handling deserves nothing but a round of applause!

Steve

