On 19 July 2013 20:48, Steve Dower <Steve.Dower@microsoft.com> wrote:
From: Oscar Benjamin
I don't know whether or not you intend to have wrappers also work for Python 2.7 (in a third-party package perhaps) but there is a slightly subtle point to watch out for when non-ASCII characters in sys.argv come into play.
Python 2.x uses GetCommandLineA and 3.x uses GetCommandLineW. A wrapper to launch 2.x should use GetCommandLineA and CreateProcessA to ensure that the 8-bit argument strings are passed through unaltered. To launch 3.x it should use the W versions. If not then the MSVC runtime (or the OS?) will convert between the 8-bit and 16-bit encodings using its own lossy routines.
The launcher should always use GetCommandLineW, because the command line is already stored in a 16-bit encoding. GetCommandLineA will convert it to an 8-bit encoding using some code page/settings (I can probably find out exactly which ones, but I don't know/care off the top of my head), and CreateProcessA will convert back using (hopefully) the same code page.
There is never any point passing data between *A APIs in Windows, because they are just doing the conversion in the background. All you gain is that the launcher will corrupt the command line before python.exe gets a chance to.
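A rough sketch of why the round trip through the *A APIs loses data, written in Python for illustration only. It assumes cp1252 as a stand-in for the process's ANSI code page and uses the 'replace' error handler; the real Windows conversion may apply best-fit mappings instead of '?', and the helper name ansi_round_trip is hypothetical.

```python
def ansi_round_trip(command_line, code_page="cp1252"):
    """Simulate the 16-bit -> 8-bit -> 16-bit conversion of an *A API pair.

    Hypothetical helper: cp1252 is an assumed code page, and 'replace'
    only approximates what the OS does with unmappable characters.
    """
    # What a GetCommandLineA-style call would hand back: 8-bit bytes.
    narrow = command_line.encode(code_page, errors="replace")
    # What a CreateProcessA-style call would do: widen the bytes again.
    return narrow.decode(code_page, errors="replace")


ascii_only = "python.exe script.py --name report"
# Cyrillic argument, not representable in cp1252.
non_ascii = "python.exe script.py --name \u0436\u0443\u043a"

# ASCII survives the round trip unchanged.
print(ansi_round_trip(ascii_only) == ascii_only)

# Characters outside the code page do not survive.
print(ansi_round_trip(non_ascii) == non_ascii)
print(ansi_round_trip(non_ascii))
```

Under these assumptions the Cyrillic argument is already damaged before the child process sees it, which is the corruption described above; staying on the W APIs end to end avoids the conversion entirely.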
Okay, thanks for the correction. The issue that made me think this was to do with calling Python 2.x as a subprocess of 3.x and vice-versa. When I looked back at it now I saw that the problem was to do with explicitly encoding with sys.getfilesystemencoding() in Python and using the mbcs codec (which previously had no error handling apart from 'replace').
Oscar