[issue28180] sys.getfilesystemencoding() should default to utf-8

STINNER Victor report at bugs.python.org
Fri Jan 6 21:53:55 EST 2017


STINNER Victor added the comment:

Sworddragon added the comment:
> (for me and maybe others that is explicitly preferred but maybe this depends on each individual)

That's why PEP 540 has options to enable or disable its UTF-8 mode(s).

> If I'm not wrong PEP 538 improves this for the output too but input handling will still suffer from the overall issue while PEP 540 does also solve this case.

PEP 538 works fine if all inputs and outputs are encoded in UTF-8.
As I understand it, failing on decoding/encoding errors (rather than
using surrogateescape) is a deliberate choice, but I may be wrong.
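
To illustrate the difference between the two error handlers, here is a
minimal sketch. The sample bytes and the helper name decode_strict are
made up for the example; they are not taken from either PEP.

```python
# Hypothetical input: "café" encoded as Latin-1, which is not valid UTF-8.
raw = b"caf\xe9"


def decode_strict(data):
    """Decode as UTF-8 with the strict handler; return None on failure."""
    try:
        return data.decode("utf-8", "strict")
    except UnicodeDecodeError:
        return None


# The strict handler rejects the undecodable byte.
strict_result = decode_strict(raw)

# surrogateescape maps the bad byte 0xE9 to the lone surrogate U+DCE9,
# so decoding succeeds and the original bytes can be restored later.
escaped = raw.decode("utf-8", "surrogateescape")
roundtrip = escaped.encode("utf-8", "surrogateescape")

print(strict_result, ascii(escaped), roundtrip)
```

With the strict handler the bad input is reported immediately; with
surrogateescape it is carried along silently and only shows up again
when the string is encoded.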

> Also PEP 540 would not make the C locale and thus eventually some systems potentially unsupported (but it might be an acceptable trade-off if we should really go PEP 538).

What do you mean by "make the C locale"?

> Specific for PEP 540:
>
>> The POSIX locale enables the UTF-8 mode
>
> Non-strict I assume?

Yes, non-strict.

I'm not sure of the name of each mode yet.

After writing the "Use Cases" section, and especially the Mojibake
column of results, I'm considering renaming the "UTF-8 mode" to
"YOLO mode".

>> UTF-8 /backslashreplace
>
> Was/is the reason to use backslashreplace for sys.stderr to guarantee that the developer/user sees the error messages?

Yes.

> Might it make sense to also use surrogateescape instead of backslashescape for sys.stderr in UTF-8 non-strict mode to be consistent here?

Using surrogateescape means that you pass undecodable bytes through
from inputs to stderr, which can cause various kinds of bad surprises.

stderr is used to log errors. Getting a new error when trying to log
an error is kind of annoying.
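
A small sketch of what each handler does when a message containing an
escaped byte is encoded for output. The message text is invented for
the example, and it only encodes strings; it does not reconfigure
sys.stderr.

```python
# Hypothetical log message built from input containing an undecodable byte.
message = b"cannot open \xff.txt".decode("utf-8", "surrogateescape")

# strict: encoding the lone surrogate fails, so logging the error
# would itself raise a new error.
try:
    message.encode("utf-8", "strict")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# backslashreplace: always succeeds and the output is pure ASCII here.
as_backslash = message.encode("utf-8", "backslashreplace")

# surrogateescape: the raw 0xFF byte is written back unchanged.
as_surrogate = message.encode("utf-8", "surrogateescape")

print(strict_failed, as_backslash, as_surrogate)
```

Only backslashreplace both never fails and never emits bytes that are
invalid in the output encoding.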

Victor

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue28180>
_______________________________________

