[Python-Dev] [Python-3000] New proposition for Python3 bytes filename issue
"Martin v. Löwis"
martin at v.loewis.de
Tue Sep 30 22:45:55 CEST 2008
> I'm not sure either way. I've heard it claimed that Windows filesystem
> APIs use Unicode natively. Does Python 3.0 on Windows currently
> support filenames expressed as bytes?
Yes, it does (at least os.open and os.stat support them; the builtin
open doesn't).
> Are they encoded first before
> passing to the Unicode APIs? Using what encoding?
Python doesn't pass them to the Unicode (W) APIs. Instead, it passes
them to the "ANSI" (A) APIs (i.e. the CP_ACP APIs). On Windows NT+,
those APIs then convert the bytes to Unicode using the CP_ACP (aka
"mbcs") encoding; that conversion happens inside the system DLLs.
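
As a rough illustration of that conversion, the sketch below decodes a
bytes filename with a fixed code page. The helper name
ansi_bytes_to_text and the choice of cp1252 and cp1251 as stand-ins
for CP_ACP are assumptions made for this example only; the real
conversion happens inside the system DLLs, and the active code page
depends on how the machine is configured.

```python
def ansi_bytes_to_text(name: bytes, codepage: str = "cp1252") -> str:
    """Hypothetical stand-in for the bytes -> Unicode step of an (A) API.

    The code page is passed explicitly here so the sketch runs on any
    platform; on Windows the system would use the active CP_ACP.
    """
    return name.decode(codepage)


raw = b"caf\xe9.txt"

# The same bytes name a different file depending on the code page.
western = ansi_bytes_to_text(raw, "cp1252")   # byte 0xE9 -> U+00E9
cyrillic = ansi_bytes_to_text(raw, "cp1251")  # byte 0xE9 -> U+0439

print(ascii(western))
print(ascii(cyrillic))
```

The point of the sketch is only that a bytes filename has no fixed
meaning until a code page is chosen for it.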
CP_ACP is a lossy encoding (from Unicode to bytes): Microsoft uses
replacement characters where it can, starting with similar-looking
characters, and falling back to question marks.
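
The question-mark fallback can be sketched in Python as below. This is
an approximation under stated assumptions: cp1252 stands in for CP_ACP,
the helper name to_ansi is made up for the example, and Python's
"replace" error handler only reproduces the final fallback to "?", not
the substitution of similar-looking characters that Windows tries
first.

```python
def to_ansi(name: str, codepage: str = "cp1252") -> bytes:
    """Hypothetical stand-in for a lossy Unicode -> code page conversion.

    errors="replace" turns every unencodable character into b"?".
    It does not emulate the similar-looking-character substitution.
    """
    return name.encode(codepage, errors="replace")


original = "\u65e5\u672c.txt"          # two CJK characters plus ".txt"
encoded = to_ansi(original)            # neither character exists in cp1252
roundtrip = encoded.decode("cp1252")

print(encoded)
print(roundtrip == original)
```

Once the characters have been replaced, the original name cannot be
recovered from the bytes, which is what makes the encoding lossy.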
Regards,
Martin