[issue19846] print() and write() are relying on sys.getfilesystemencoding() instead of sys.getdefaultencoding()

STINNER Victor report at bugs.python.org
Sun Dec 8 12:45:22 CET 2013


STINNER Victor added the comment:

2013/12/8 Antoine Pitrou <report at bugs.python.org>:
>> Python uses the fact that the filesystem encoding is the locale
>> encoding in various places.
>
> The patch doesn't change that.

You wrote: "-> With the patch: utf-8 utf-8 utf-8 ANSI_X3.4-1968", so
sys.getfilesystemencoding() != locale.getpreferredencoding().
In other words, the filesystem encoding differs from the locale
encoding.
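
A minimal sketch for checking this on a given build, using only stdlib
calls. The helper names encoding_report() and fs_matches_locale() are
made up for illustration, and the choice of which encodings to report
is an assumption, not necessarily the same values quoted above:

```python
import codecs
import locale
import sys


def encoding_report():
    """Collect the encodings discussed in this issue into a dict."""
    return {
        "default": sys.getdefaultencoding(),
        "filesystem": sys.getfilesystemencoding(),
        # sys.stdout may be replaced by an object without .encoding
        "stdout": getattr(sys.stdout, "encoding", None),
        "locale": locale.getpreferredencoding(False),
    }


def fs_matches_locale(report):
    """Return True if filesystem and locale encodings name the same codec."""
    # codecs.lookup() normalizes aliases, e.g. ANSI_X3.4-1968 -> ascii
    fs = codecs.lookup(report["filesystem"]).name
    loc = codecs.lookup(report["locale"]).name
    return fs == loc


if __name__ == "__main__":
    report = encoding_report()
    for key, value in report.items():
        print(key, value)
    print("FS == locale:", fs_matches_locale(report))
```

Running it once normally and once with LANG=C shows whether the two
encodings diverge on that interpreter.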

So please reread the following message of mine, which lists real bugs:
https://mail.python.org/pipermail/python-dev/2010-October/104509.html

If you want to use a filesystem encoding different from the locale
encoding, you have to patch every place where Python assumes that the
filesystem encoding is the locale encoding, to fix all these bugs.
Start with:

- PyUnicode_DecodeFSDefaultAndSize()
- PyUnicode_EncodeFSDefault()
- _Py_wchar2char()
- _Py_char2wchar()

It should be easier to change these functions if FS != locale only
occurs when FS is "UTF-8". On Mac OS X, Python always uses UTF-8 for
the filesystem encoding, regardless of the locale encoding. See
_Py_DecodeUTF8_surrogateescape() in unicodeobject.c; you may reuse it.
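
For readers less familiar with it, the behaviour of UTF-8 with the
surrogateescape error handler can be sketched at the Python level. This
is only an illustration of the codec semantics, not the C code from
unicodeobject.c; decode_fs() and encode_fs() are hypothetical helper
names:

```python
def decode_fs(data):
    """Decode bytes as UTF-8, mapping undecodable bytes to lone surrogates."""
    return data.decode("utf-8", "surrogateescape")


def encode_fs(text):
    """Encode str as UTF-8, turning escaped surrogates back into raw bytes."""
    return text.encode("utf-8", "surrogateescape")


if __name__ == "__main__":
    raw = b"caf\xe9"  # Latin-1 bytes, not valid UTF-8
    name = decode_fs(raw)
    # The invalid byte 0xE9 becomes the lone surrogate U+DCE9
    print(ascii(name))
    # Encoding with the same handler restores the original bytes
    print(encode_fs(name) == raw)
```

The round trip is lossless, which is why undecodable filenames survive
decoding and re-encoding as long as both directions use the same codec
and error handler.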

With a better patch, I can run more experiments to check whether there
are other tricky bugs.

Does your patch at least pass the whole test suite with LANG=C?

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue19846>
_______________________________________

