[issue19846] Python 3 raises Unicode errors with the C locale

STINNER Victor report at bugs.python.org
Tue Dec 10 21:27:49 CET 2013


STINNER Victor added the comment:

2013/12/10 Toshio Kuratomi <report at bugs.python.org>:
> if G_FILENAME_ENCODING:
>     charset = the first charset listed in G_FILENAME_ENCODING
>     if charset == '@locale':
>         charset = charset of user's locale
> elif G_BROKEN_FILENAMES:
>     charset = charset of user's locale
> else:
>     charset = 'UTF-8'
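
For reference, a rough Python sketch of the pseudocode quoted above. It is not GLib's actual code: filename_charset() and its two parameters are hypothetical names, and it assumes that G_FILENAME_ENCODING is a comma-separated list and that any non-empty G_BROKEN_FILENAMES value counts as "set".

```python
import locale
import os


def filename_charset(environ=None, locale_charset=None):
    """Pick a single filename charset, following the quoted pseudocode.

    Hypothetical helper: both parameters exist only to make the
    function easy to test without touching the real environment.
    """
    if environ is None:
        environ = os.environ
    if locale_charset is None:
        # Charset of the user's locale, without calling setlocale().
        locale_charset = locale.getpreferredencoding(False)

    encodings = environ.get("G_FILENAME_ENCODING")
    if encodings:
        # Take the first charset listed in G_FILENAME_ENCODING.
        charset = encodings.split(",")[0].strip()
        if charset == "@locale":
            charset = locale_charset
    elif environ.get("G_BROKEN_FILENAMES"):
        charset = locale_charset
    else:
        charset = "UTF-8"
    return charset
```

Example: filename_charset({"G_FILENAME_ENCODING": "@locale,UTF-8"}, "ISO-8859-1") picks the locale charset, while an empty environment gives 'UTF-8'.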

g_get_filename_charsets() returns a list of encodings. In the last
case (else:), it returns ['utf-8', local_encoding] on UNIX. This is
reliable because the UTF-8 decoder is strict: it fails if the byte
string is not valid UTF-8, so the locale encoding is only used as a
fallback when UTF-8 decoding fails.
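
A minimal Python sketch of that "try UTF-8, then fall back" idea, only to illustrate the behaviour. decode_filename() is a hypothetical helper, not an existing Python or GLib function, and the locale encoding is passed in explicitly as an assumption:

```python
def decode_filename(raw, locale_encoding):
    """Decode a bytes filename: strict UTF-8 first, then the locale encoding.

    Hypothetical helper for illustration only.
    """
    try:
        # The strict UTF-8 decoder rejects byte strings that are not
        # valid UTF-8, so a successful decode is a good sign that the
        # name really is UTF-8.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8: fall back to the locale encoding.
        return raw.decode(locale_encoding)
```

With locale_encoding set to 'latin-1', both b'caf\xc3\xa9' (UTF-8) and b'caf\xe9' (Latin-1) decode to the same text. The fallback can still raise UnicodeDecodeError if the bytes are not valid in the locale encoding either.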

It would be interesting to test this approach (try UTF-8 first, then
fall back to the locale encoding) in
PyUnicode_DecodeFSDefault/PyUnicode_EncodeFSDefault and
_Py_char2wchar/_Py_wchar2char.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue19846>
_______________________________________

