[issue10945] bdist_wininst depends on MBCS codec, unavailable on non-Windows

STINNER Victor report at bugs.python.org
Fri Oct 14 13:59:26 CEST 2011


STINNER Victor <victor.stinner at haypocalc.com> added the comment:

> It is not code under the users’ control (i.e. setup.py)
> that uses MBCS, but the bdist_wininst command itself.

The bdist_wininst command appends configuration data to a wininst-xxx.exe binary. Where does this file come from? Can we modify the wininst-xxx.exe binaries?


If we can modify the binaries, we can change the format to store the configuration data as UTF-8 instead of the ANSI code page.
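To illustrate why a fixed encoding would help, here is a minimal sketch. It only shows the round-trip property of UTF-8 and says nothing about the real wininst data layout; the variable names and the sample string are chosen for illustration.

```python
# Hypothetical metadata string; any non-ASCII text would do.
metadata = "Hé ho"

# UTF-8 bytes do not depend on the ANSI code page of the build machine.
stored = metadata.encode("utf-8")

# The reader decodes with the same fixed codec, whatever its locale is.
restored = stored.decode("utf-8")
print(restored == metadata)
```

The stub in wininst-xxx.exe would of course have to do the equivalent decoding on its side, which is why the binaries would need to change.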

It's surprising and "unsafe" (not portable) to use the ANSI code page for an installer: if you build your installer on a French setup (ANSI=cp1252), the configuration will be interpreted incorrectly on a Japanese setup (ANSI=cp932). Example:

>>> 'Hé ho'.encode('cp1252').decode('cp932')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'cp932' codec can't decode bytes in position 1-2: illegal multibyte sequence

So if the configuration data (package metadata) contains non-ASCII characters, you will not be able to use your installer on a computer whose ANSI code page differs from the code page of the computer used to build the installer... In the best case, you will just get mojibake.
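The mojibake case can be sketched the same way as the decoding error. The cp1251 (Cyrillic) code page is my choice for the illustration: it accepts the byte 0xE9 but maps it to a different letter, so decoding does not fail and the text is silently wrong.

```python
# Bytes as they would be written on a cp1252 build machine.
built = "Hé ho".encode("cp1252")

# The same bytes read on a machine whose ANSI code page is cp1251.
shown = built.decode("cp1251")

# No exception is raised, but the accented letter has changed.
print(shown)
print(shown == "Hé ho")
```

Here the user sees a Cyrillic letter in place of the "é", and nothing tells them the text is wrong.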

If we cannot modify wininst-xxx.exe, an alternative that would allow generating installers on non-Windows platforms is to use the most common ANSI code page (cp1252?), or maybe ASCII.
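A rough sketch of that fallback follows. The helper name pick_config_encoding is hypothetical and this is not the distutils code; it only relies on codecs.lookup() raising LookupError when the "mbcs" codec does not exist, which is the case on non-Windows platforms.

```python
import codecs


def pick_config_encoding(fallback="cp1252"):
    """Return "mbcs" where Python provides it, else the fallback codec name."""
    try:
        codecs.lookup("mbcs")
    except LookupError:
        # Non-Windows platform: the MBCS codec is not available.
        return fallback
    return "mbcs"


encoding = pick_config_encoding()
print(encoding)
```

With cp1252 as the fallback, an installer built on Linux would still be misread on a Windows machine using another ANSI code page, so this only removes the build failure, not the portability problem.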

Using the ASCII encoding is the safest solution because you will be able to use your installer on all Windows setups (all ANSI code pages are compatible with ASCII), but you will not be able to generate an installer if the package metadata contains at least one non-ASCII character...
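The ASCII trade-off could look like the sketch below. The function name encode_metadata_ascii and the error message are made up for the example; the point is only that the build fails early instead of producing an installer that is misread later.

```python
def encode_metadata_ascii(text):
    """Encode metadata as ASCII, refusing anything that is not portable."""
    try:
        return text.encode("ascii")
    except UnicodeEncodeError as exc:
        raise ValueError(
            "package metadata contains non-ASCII character %r at position %d"
            % (text[exc.start], exc.start)
        ) from None


data = encode_metadata_ascii("Hello")

# ASCII bytes decode to the same text under both code pages from the example.
print(data.decode("cp1252") == data.decode("cp932"))
```

Calling encode_metadata_ascii('Hé ho') raises ValueError, which is the limitation described above: such a package could not be built at all.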

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue10945>
_______________________________________

