[New-bugs-announce] [issue2128] sys.argv is wrong for unicode strings
Giovanni Bajo
report at bugs.python.org
Sat Feb 16 17:27:46 CET 2008
New submission from Giovanni Bajo:
Under Windows, sys.argv is created through the Windows ANSI API.
When a file or directory name can't be represented in the
system encoding (e.g. a Japanese-named file or directory on a Western
Windows), Windows encodes the filename to the system encoding using
what we call the "replace" policy, and thus sys.argv[] contains an
entry like "c:\\foo\\??????????????.dat".
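
A rough illustration of the effect described above, using Python's own
"replace" error handler. The code page (cp1252) and the file name are
assumptions chosen for the example, and the conversion Windows itself
performs may differ in detail; this only sketches how the original
characters are lost.

```python
# Hypothetical Japanese file name; cp1252 (a typical Western Windows
# code page, assumed here) has no mapping for these three characters.
path = "c:\\foo\\\u65e5\u672c\u8a9e.dat"

# errors="replace" substitutes "?" for every character that cannot be
# encoded, so the original name cannot be recovered afterwards.
mangled = path.encode("cp1252", errors="replace").decode("cp1252")

print(mangled)  # c:\foo\???.dat
```

Once the name has been reduced to question marks, the script has no way
to open the file the user actually passed.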
My suggestion is that:
* At the Python level, we still expose a single sys.argv[], which will
contain Unicode strings. I think this exactly matches what Py3k does now.
* At the C level, I believe it involves using GetCommandLineW() and
CommandLineToArgvW() in WinMain.c. Should Py_Main/PySys_SetArgv() be
changed to also accept wchar_t** arguments? Or is it better to allow
NULL to be passed (under Windows at least), so that the Windows
code-path in there can use GetCommandLineW()/CommandLineToArgvW() to get
the current process's arguments?
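
The two Windows calls named above can be tried from Python through
ctypes. This is only a sketch of the idea, not the proposed change to
WinMain.c or PySys_SetArgv(); the helper name get_unicode_argv() is
hypothetical, and on non-Windows platforms it simply returns a copy of
sys.argv.

```python
import ctypes
import sys


def get_unicode_argv():
    """Return the process command line as a list of str (hypothetical helper)."""
    if sys.platform != "win32":
        # No wide-character Windows API here; fall back to sys.argv.
        return list(sys.argv)

    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    shell32 = ctypes.WinDLL("shell32", use_last_error=True)

    # LPWSTR GetCommandLineW(void)
    kernel32.GetCommandLineW.argtypes = []
    kernel32.GetCommandLineW.restype = wintypes.LPCWSTR

    # LPWSTR *CommandLineToArgvW(LPCWSTR lpCmdLine, int *pNumArgs)
    shell32.CommandLineToArgvW.argtypes = [
        wintypes.LPCWSTR,
        ctypes.POINTER(ctypes.c_int),
    ]
    shell32.CommandLineToArgvW.restype = ctypes.POINTER(wintypes.LPWSTR)

    # HLOCAL LocalFree(HLOCAL hMem)
    kernel32.LocalFree.argtypes = [ctypes.c_void_p]
    kernel32.LocalFree.restype = ctypes.c_void_p

    argc = ctypes.c_int(0)
    argv = shell32.CommandLineToArgvW(
        kernel32.GetCommandLineW(), ctypes.byref(argc)
    )
    if not argv:
        raise ctypes.WinError(ctypes.get_last_error())
    try:
        # Copy the wide strings out before releasing the buffer.
        return [argv[i] for i in range(argc.value)]
    finally:
        kernel32.LocalFree(argv)


unicode_argv = get_unicode_argv()
```

Note that on Windows the list returned here is the full process command
line, so it also includes the interpreter executable and any interpreter
options, which sys.argv normally omits.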
----------
components: Interpreter Core
messages: 62458
nosy: giovannibajo
severity: normal
status: open
title: sys.argv is wrong for unicode strings
type: behavior
versions: Python 3.0
__________________________________
Tracker <report at bugs.python.org>
<http://bugs.python.org/issue2128>
__________________________________