[Python-Dev] Use our strict mbcs codec instead of the Windows ANSI API

Victor Stinner victor.stinner at haypocalc.com
Tue Oct 25 10:52:43 CEST 2011


On Tuesday 25 October 2011 09:09:56 you wrote:
> If it's the byte APIs (i.e. using bytes as file names), then I'm -1 on
> this proposal. People that explicitly use bytes for file names deserve
> to get whatever exact platform semantics the platform has to offer. This
> is true on Unix, and it is also true on Windows.

For your information, it took me about 3 months (while I was working on
issue #12281) to understand exactly how Windows handles undecodable
bytes and unencodable characters. I ran a lot of tests on different Windows
versions (XP, Vista and 7; the behaviour changed in Windows Vista). I had
to take notes because it is really complex. I wanted to understand
exactly *all* code pages, including CP_UTF7 and CP_UTF8, not only the most
common ones like cp1252 or cp932.
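
To make "undecodable bytes" and "unencodable characters" concrete, here is a
minimal sketch. It assumes Python's built-in cp1252 codec as a stand-in and does
not call the Windows API at all, so it only shows what "strict" means, not
what Windows itself does. The helper names decode_strict and encode_strict are
hypothetical, chosen for this example.

```python
from typing import Optional


def decode_strict(data: bytes, encoding: str = "cp1252") -> Optional[str]:
    """Decode strictly; return None if any byte is undecodable."""
    try:
        return data.decode(encoding, "strict")
    except UnicodeDecodeError:
        return None


def encode_strict(text: str, encoding: str = "cp1252") -> Optional[bytes]:
    """Encode strictly; return None if any character is unencodable."""
    try:
        return text.encode(encoding, "strict")
    except UnicodeEncodeError:
        return None


# Byte 0x81 is not assigned in Python's cp1252 table: strict decoding fails,
# while the "replace" handler silently substitutes U+FFFD.
undecodable = b"\x81"
strict_decoded = decode_strict(undecodable)
lossy_decoded = undecodable.decode("cp1252", "replace")

# U+0141 (LATIN CAPITAL LETTER L WITH STROKE) is not in cp1252: strict
# encoding fails, while the "replace" handler silently substitutes "?".
unencodable = "\u0141"
strict_encoded = encode_strict(unencodable)
lossy_encoded = unencodable.encode("cp1252", "replace")
```

The strict variants report the problem to the caller, whereas the lossy ones
hide it, which is the kind of difference the tests above were about.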

See the dedicated section in my book to learn more about these functions:

http://www.haypocalc.com/tmp/unicode-2011-07-20/html/operating_systems.html#encode-and-decode-functions

Some information is available in the MultiByteToWideChar and
WideCharToMultiByte documentation, but it is not well explained :-p

Victor

