[Tutor] sys.getfilesystemencoding()

eryksun eryksun at gmail.com
Wed Dec 19 18:20:41 CET 2012


On Wed, Dec 19, 2012 at 5:43 AM, Albert-Jan Roskam <fomcl at yahoo.com> wrote:
>
>So MBCS is just a collective noun for whatever happens to be the
>installed/available codepage of the host computer (at least with
>CP_ACP)?

To be clear, the "mbcs" encoding in Python uses CP_ACP. MBCS means
multibyte character set. The term ANSI gets thrown around, too, but
Windows legacy code pages aren't ANSI standards.
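
A small illustration of why the code page matters, as a sketch only: cp1252 and cp1251 are stand-ins I picked for two possible ANSI code pages, since the "mbcs" codec itself exists only on Windows. The same byte decodes to different text under each.

```python
# Illustration only: cp1252 and cp1251 stand in for two possible
# ANSI code pages (CP_ACP); "mbcs" itself exists only on Windows.
raw = b"\xe9"

western = raw.decode("cp1252")   # U+00E9, LATIN SMALL LETTER E WITH ACUTE
cyrillic = raw.decode("cp1251")  # U+0439, CYRILLIC SMALL LETTER SHORT I

# Same byte, different text, depending on the active code page.
same_text = (western == cyrillic)
```

On a Windows machine, decoding with "mbcs" should behave like whichever of these code pages is configured as CP_ACP.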

>I didn't know anything about wintypes and I find it quite hard to
>understand! I am trying to write a ctypes wrapper for
>WideCharToMultiByte.

Just for the fun of it?

>http://pastebin.com/SEr4Wec9
>The code is kinda verbose, but I hope this makes it easier to read.
>Does this make sense at all? As for now, the program returns an
>error code (oddly, zero is an error code here).

Use None for NULL.

You shouldn't encode a string argument you've declared as c_wchar_p
(i.e. wintypes.LPCWSTR, i.e. type 'Z'). If you initialize to a byte
string, the setter Z_set calls PyUnicode_FromEncodedObject using the
"mbcs" encoding (this is the default on Windows, set by
set_conversion_mode("mbcs", "ignore")). This hands off to decode_mbcs,
which produces nonsense for a UTF-16LE encoded string.
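
To see roughly what that nonsense looks like, here is a pure-Python sketch of the same round trip. It assumes cp1252 as a stand-in for the ANSI code page behind "mbcs", and it only imitates the decode step rather than calling ctypes (Python 3's ctypes rejects byte strings for c_wchar_p outright).

```python
# Assumption: cp1252 stands in for the ANSI code page that "mbcs" would use.
text = u"caf\u00e9"
wrongly_encoded = text.encode("utf-16-le")  # what the caller passed in

# Roughly what the c_wchar_p setter does with a byte string:
# decode it with the ANSI code page, ignoring errors.
garbled = wrongly_encoded.decode("cp1252", "ignore")

# Every second character is now U+0000, so any API that treats the
# string as NUL-terminated would see only the first character.
seen_by_c = garbled.split(u"\x00")[0]
```

The result is twice as long as the original and interleaved with NULs, which is why the string should be passed as unicode without encoding it first.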

GetLastError should be defined already, along with WinError, a
convenience function that returns an instance of WindowsError. 2.6.4
source:

http://hg.python.org/cpython/file/8803c3d61da2/Lib/ctypes/__init__.py#l448

Here's a quick hack that should help you along:

    import ctypes
    from ctypes import wintypes

    _CP_UTF8 = 65001
    _CP_ACP = 0  # ANSI
    _LPBOOL = ctypes.POINTER(ctypes.c_long)

    _wideCharToMultiByte = ctypes.windll.kernel32.WideCharToMultiByte
    _wideCharToMultiByte.restype = ctypes.c_int
    _wideCharToMultiByte.argtypes = [
      wintypes.UINT, wintypes.DWORD, wintypes.LPCWSTR, ctypes.c_int,
      wintypes.LPSTR, ctypes.c_int, wintypes.LPCSTR, _LPBOOL]

    def wide2utf8(fn):
        codePage = _CP_UTF8
        dwFlags = 0
        lpWideCharStr = fn
        cchWideChar = len(fn)
        lpMultiByteStr = None
        cbMultiByte = 0  # zero requests size
        lpDefaultChar = None
        lpUsedDefaultChar = None
        # get size
        mbcssize = _wideCharToMultiByte(
          codePage, dwFlags, lpWideCharStr, cchWideChar, lpMultiByteStr,
          cbMultiByte, lpDefaultChar, lpUsedDefaultChar)
        if mbcssize <= 0:
            # zero means failure; with no argument WinError looks up GetLastError()
            raise ctypes.WinError()
        lpMultiByteStr = ctypes.create_string_buffer(mbcssize)
        # convert
        retcode = _wideCharToMultiByte(
          codePage, dwFlags, lpWideCharStr, cchWideChar, lpMultiByteStr,
          mbcssize, lpDefaultChar, lpUsedDefaultChar)
        if retcode <= 0:
            # zero means failure; with no argument WinError looks up GetLastError()
            raise ctypes.WinError()
        return lpMultiByteStr.value
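
One way to sanity-check a wrapper like this is to compare its output with Python's own UTF-8 codec. The following is a minimal sketch; find_mismatches and samples are names made up for illustration, and a pure-Python stand-in replaces wide2utf8 so the snippet runs on any platform.

```python
def find_mismatches(convert, samples):
    """Return the samples whose converted bytes differ from the UTF-8 codec."""
    return [s for s in samples if convert(s) != s.encode("utf-8")]

samples = [u"abc", u"caf\u00e9", u"\u65e5\u672c\u8a9e"]

# On Windows, pass wide2utf8 here instead. The stand-in below only
# keeps the sketch runnable where kernel32 is not available.
reference = lambda s: s.encode("utf-8")
mismatches = find_mismatches(reference, samples)
```

An empty list means the converter agrees with the codec on every sample.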

