Re: [Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue

On Tue, Sep 30, 2008 at 11:13 AM, Georg Brandl <g.brandl@gmx.net> wrote:
> Victor Stinner wrote:
>> On Windows, we might reject bytes filenames for all file operations: open(), unlink(), os.path.join(), etc. (raise a TypeError or UnicodeError)
> Since I've seen no objections to this yet: please no. If we offer a "lower-level" bytes filename API, it should work for all platforms.
I'm not sure either way. I've heard it claimed that Windows filesystem APIs use Unicode natively. Does Python 3.0 on Windows currently support filenames expressed as bytes? Are they encoded first before passing to the Unicode APIs? Using what encoding?

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

> I'm not sure either way. I've heard it claimed that Windows filesystem APIs use Unicode natively. Does Python 3.0 on Windows currently support filenames expressed as bytes?
Yes, it does (at least, os.open, os.stat support them, builtin open doesn't).
> Are they encoded first before passing to the Unicode APIs? Using what encoding?
They aren't passed to the Unicode (W) APIs (by Python). Instead, they are passed to the "ANSI" (A) APIs (i.e. the CP_ACP APIs). On Windows NT+, those APIs then convert them to Unicode using the CP_ACP (aka "mbcs") encoding; this happens inside the system DLLs. CP_ACP is a lossy encoding (from Unicode to bytes): Microsoft uses replacement characters where it can, starting with similar-looking characters, and falling back to question marks.

Regards,
Martin
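The lossiness described above can be illustrated with a small, portable sketch. It makes two assumptions: "cp1252" stands in for CP_ACP (the real value depends on the system locale, and Python's "mbcs" codec exists only on Windows), and ansi_round_trip is a hypothetical helper name, not part of Python. Python's codec only reproduces the question-mark fallback; the substitution of similar-looking characters happens inside Windows and is not shown here.

```python
def ansi_round_trip(name: str, codepage: str = "cp1252") -> str:
    """Encode a filename to a single-byte code page and decode it back.

    "cp1252" is an assumed stand-in for CP_ACP. errors="replace" turns
    every character the code page cannot represent into "?".
    """
    raw = name.encode(codepage, errors="replace")
    return raw.decode(codepage)


if __name__ == "__main__":
    for name in ["report.txt", "café.txt", "日本語.txt"]:
        result = ansi_round_trip(name)
        status = "intact" if result == name else "lossy"
        print(f"{name!r} -> {result!r} ({status})")
```

Names that fit in the code page survive the round trip; names outside it come back with question marks, so two different Unicode filenames can collapse to the same bytes.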
Participants (2): "Martin v. Löwis", Guido van Rossum