[Tutor] sys.getfilesystemencoding()

eryksun eryksun at gmail.com
Tue Dec 18 16:07:49 CET 2012


On Tue, Dec 18, 2012 at 8:13 AM, Albert-Jan Roskam <fomcl at yahoo.com> wrote:
>
>In windows xp, the characters can, apparently, not be represented
>in this encoding called 'mbcs'.

MBCS (multibyte character set) refers to the locale encoding on
Windows. CPython encodes to MBCS via the Win32 function
WideCharToMultiByte, with the CP_ACP code page (i.e. the system
default 'ANSI' encoding):

unicodeobject.c, encode_mbcs (3877-3884):
http://hg.python.org/cpython/file/8803c3d61da2/Objects/unicodeobject.c#l3843

WideCharToMultiByte:
http://msdn.microsoft.com/en-us/library/windows/desktop/dd374130

MBCS could be a double-byte encoding such as code page 936 (Simplified Chinese):

http://msdn.microsoft.com/en-US/goglobal/cc305153

But generally if you're in the West it'll be Windows-1252 (code page 1252):

http://en.wikipedia.org/wiki/Windows-1252
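
As a rough illustration of why a code page matters, here is a minimal
sketch in Python 3. It assumes the portable 'cp1252' and 'cp936' codecs
are a fair stand-in for what 'mbcs' would resolve to on a Western or a
Simplified Chinese system, since the 'mbcs' codec itself only exists on
Windows. The helper name encode_or_none is made up for this example.
Note that the real 'mbcs' codec may, depending on the Python version,
substitute '?' for an unrepresentable character instead of raising.

```python
import sys

def encode_or_none(text, codepage):
    """Encode text in the given code page; return None if it can't be represented."""
    try:
        return text.encode(codepage)
    except UnicodeEncodeError:
        return None

# 'mbcs' on older Windows Pythons; something else on other platforms.
print("filesystem encoding:", sys.getfilesystemencoding())

samples = ["caf\u00e9", "\u4e2d\u6587"]  # Latin text, Chinese text
for codepage in ("cp1252", "cp936"):
    for text in samples:
        # ascii() keeps the output printable on any console encoding.
        print(codepage, ascii(text), encode_or_none(text, codepage))
```

The Chinese sample has no representation in cp1252, so the helper
returns None for it, while cp936 encodes each of its characters as two
bytes.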

The Windows API, registry, and NT file system use UTF-16, with support
for legacy code pages. API functions that end in W are the (W)ide
interface. Functions that end in A are the ANSI wrappers. For example,
Kernel32 has both CreateFileW (UTF-16) and CreateFileA (ANSI). In C,
one uses CreateFile without the suffix; the preprocessor maps it to one
or the other depending on whether UNICODE is defined. With ctypes you'd
likely use windll.kernel32.CreateFileW and pass unicode strings, which
ctypes converts to wide-character (wchar_t, i.e. UTF-16LE on Windows)
arguments.
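
A minimal sketch of calling the wide interface from ctypes, in Python 3.
The helper name can_open_for_reading is made up for this example, and
the function simply returns None when not run on Windows, since
kernel32 does not exist elsewhere. The constants are the documented
Win32 values. The path is passed as an ordinary str; ctypes hands it to
the function as a wchar_t string, so no manual encoding is needed.

```python
import ctypes
import sys

GENERIC_READ = 0x80000000
FILE_SHARE_READ = 0x00000001
OPEN_EXISTING = 3
INVALID_HANDLE_VALUE = ctypes.c_void_p(-1).value

def can_open_for_reading(path):
    """Try to open path with CreateFileW; return True/False, or None off Windows."""
    if sys.platform != "win32":
        return None  # kernel32 is only available on Windows

    from ctypes import wintypes
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)

    # Declare the prototype so ctypes converts the arguments correctly.
    kernel32.CreateFileW.argtypes = [
        wintypes.LPCWSTR,  # lpFileName (wide string)
        wintypes.DWORD,    # dwDesiredAccess
        wintypes.DWORD,    # dwShareMode
        wintypes.LPVOID,   # lpSecurityAttributes
        wintypes.DWORD,    # dwCreationDisposition
        wintypes.DWORD,    # dwFlagsAndAttributes
        wintypes.HANDLE,   # hTemplateFile
    ]
    kernel32.CreateFileW.restype = wintypes.HANDLE
    kernel32.CloseHandle.argtypes = [wintypes.HANDLE]

    handle = kernel32.CreateFileW(path, GENERIC_READ, FILE_SHARE_READ,
                                  None, OPEN_EXISTING, 0, None)
    if handle == INVALID_HANDLE_VALUE:
        return False
    kernel32.CloseHandle(handle)
    return True

# What the wide string looks like in memory on Windows (UTF-16LE).
print("A\u00e9".encode("utf-16-le"))
print(can_open_for_reading(sys.executable))
```

Declaring argtypes is optional for a quick experiment, but it keeps the
handle and pointer arguments from being truncated on 64-bit Windows.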

