[Tutor] (no subject)

eryk sun eryksun at gmail.com
Mon Jul 25 07:55:20 EDT 2016


On Mon, Jul 25, 2016 at 10:28 AM, Steven D'Aprano <steve at pearwood.info> wrote:
> I know that Linux and Mac OS X both use UTF-8 for filenames, and so support all
> of Unicode. But Windows, I'm not sure. It might be localised to only use Latin-1
> (Western European) or similar.

Windows filesystems (e.g. NTFS, ReFS, UDF, exFAT, FAT32) store filenames
as Unicode [1], specifically UTF-16, as does the Windows wide-character
API. Using 16-bit wchar_t strings is a problem for C standard functions
such as fopen, which take 8-bit null-terminated char strings, so the
Windows C runtime also provides wide-character alternatives such as
_wfopen.
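
To see why a 16-bit string can't simply be handed to an 8-bit function,
here is a small sketch in plain Python (no Windows calls involved). It
only looks at the bytes of a UTF-16 encoded name; the name "abc" is just
an illustration.

```python
# Encode a short name the way a 16-bit wide-character string stores it.
name = u"abc"
wide = name.encode("utf-16-le")   # two bytes per code unit, little-endian
narrow = name.encode("ascii")     # one byte per character

# Every ASCII character gains a zero high byte in UTF-16.
has_embedded_nul = b"\x00" in wide

# An 8-bit null-terminated reader stops at the first zero byte,
# so it would see only the first character of the name.
seen_by_8bit_api = wide.split(b"\x00", 1)[0]

print(repr(wide))
print(repr(seen_by_8bit_api))
```

The zero bytes are what an 8-bit null-terminated API would treat as the
end of the string, hence the separate wide-character functions.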

On Windows, Python's os and io functions call the wide-character APIs when
given unicode arguments, even in 2.x. Unfortunately, some 2.x modules such
as subprocess use only the 8-bit API (e.g. 2.x Popen calls CreateProcessA
instead of CreateProcessW).
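
As a rough sketch of what passing unicode arguments looks like in
practice, the following creates and lists a file with a non-ASCII name.
It is written for Python 3, where every str is unicode; in 2.x the u""
prefixes are what select the unicode code path. The filename is made up
for the example, and on non-Windows systems this assumes the filesystem
encoding is UTF-8.

```python
import io
import os
import shutil
import tempfile

# Made-up non-ASCII filename (Cyrillic + CJK), written with escapes
# so this source file itself stays pure ASCII.
name = u"\u0416\u4e2d.txt"

tmpdir = tempfile.mkdtemp()
try:
    path = os.path.join(tmpdir, name)

    # io.open accepts a unicode path; the encoding argument applies to
    # the file's contents, not to its name.
    with io.open(path, "w", encoding="utf-8") as f:
        f.write(u"hello")

    # Given a unicode directory path, os.listdir returns unicode names.
    listed = os.listdir(tmpdir)

    with io.open(path, "r", encoding="utf-8") as f:
        content = f.read()
finally:
    # Remove the temporary directory and everything in it.
    shutil.rmtree(tmpdir)

print(listed == [name])
```

Passing byte strings instead (the default string type in 2.x) would go
through the 8-bit API on Windows, which is limited to whatever the
current code page can represent.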

[1]: https://msdn.microsoft.com/en-us/library/ee681827#limits

