On 5 March 2017 at 17:50, Nick Coghlan <email@example.com> wrote:
Late last year I started working on a change to the CPython CLI (*not* the shared library) to get it to coerce the legacy C locale to something based on UTF-8 when a suitable locale is available.
After a couple of rounds of iteration on linux-sig and python-ideas, I'm now bringing it to python-dev as a concrete proposal for Python 3.7.
For most folks, reading the Abstract plus the draft docs updates in the reference implementation will tell you everything you need to know: if the C.UTF-8, C.utf8 or UTF-8 locales are available, the CLI will automatically attempt to coerce the legacy C locale to one of those, rather than persisting with the legacy C locale's default assumption of ASCII as the preferred text encoding.
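As a rough illustration of the selection step described above (not the reference implementation, which lives in the CLI's C startup code), the logic could be sketched in Python as follows. The names `COERCION_TARGETS` and `pick_coercion_target`, and the `is_available` callback, are hypothetical and exist only for this sketch; detecting the legacy locale is also simplified to a plain comparison against "C".

```python
# Candidate target locales, in the order they are listed above.
# COERCION_TARGETS and pick_coercion_target are hypothetical names
# used only for this sketch.
COERCION_TARGETS = ("C.UTF-8", "C.utf8", "UTF-8")


def pick_coercion_target(current_locale, is_available):
    """Return the locale to coerce to, or None to leave the locale alone.

    current_locale: the effective locale name, e.g. "C" or "en_US.UTF-8".
    is_available:   callable taking a locale name and returning True if the
                    platform provides it (an assumption of this sketch; the
                    real check is done in C).
    """
    # Only the legacy C locale is a candidate for coercion.
    if current_locale != "C":
        return None
    # Take the first target the platform actually provides.
    for candidate in COERCION_TARGETS:
        if is_available(candidate):
            return candidate
    # No suitable target: keep the legacy C locale as-is.
    return None
```

For example, on a hypothetical platform that only provides "C.utf8", `pick_coercion_target("C", lambda name: name == "C.utf8")` returns "C.utf8", while any locale other than "C" is left untouched.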
I've just pushed a significant update to the PEP based on the discussions in this thread: https://github.com/python/peps/commit/2fb53e7c1bbb04e1321bca11cc0112aec69f63...
The main change at the technical level is to modify the handling of the coercion target locales such that they *always* lead to "surrogateescape" being used by default on the standard streams. That means we don't need to call "Py_SetStandardStreamEncoding" during startup, that subprocesses will behave the same way as their parent processes, and that Python in Linux containers will behave consistently regardless of whether the container locale is set to "C.UTF-8" explicitly, or is set to "C" and then coerced to "C.UTF-8" by CPython.
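To make the practical effect of "surrogateescape" concrete, here is a small self-contained sketch. It uses in-memory `io.TextIOWrapper` objects as stand-ins for the standard streams (that substitution is an assumption made for illustration; it does not touch the real `sys.stdin` or `sys.stdout`) and shows bytes that are not valid UTF-8 passing through unchanged.

```python
import io

# Latin-1 encoded data: the byte 0xE9 is not valid UTF-8 on its own.
raw_bytes = b"caf\xe9"

# Stand-in for stdin: a UTF-8 text stream using surrogateescape.
reader = io.TextIOWrapper(
    io.BytesIO(raw_bytes), encoding="utf-8", errors="surrogateescape"
)
text = reader.read()
# The undecodable byte is smuggled through as a lone surrogate,
# so text == "caf\udce9" rather than a UnicodeDecodeError being raised.

# Stand-in for stdout: writing the text back restores the original byte.
sink = io.BytesIO()
writer = io.TextIOWrapper(sink, encoding="utf-8", errors="surrogateescape")
writer.write(text)
writer.flush()
round_tripped = sink.getvalue()
# round_tripped == raw_bytes
```

With the default "strict" error handler, the same `read()` call would raise `UnicodeDecodeError`. On a running interpreter, the handler actually in use can be inspected via `sys.stdin.errors` and `sys.stdout.errors`.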
That change also eliminated the behaviour that was contingent on whether or not PEP 540 was accepted - PEP 540 may still want to have the coercion target locales imply full UTF-8 mode rather than just setting the stream error handler differently, but that will be a question to be considered when reviewing PEP 540 rather than needing to worry about it now.
The second technical change is that the locale coercion and warning are now enabled on Android and Mac OS X. For Android, that's a matter of getting GNU readline to behave sensibly, while for Mac OS X, it's a matter of simplifying the implementation and improving cross-platform behavioural consistency (even though we don't expect the coercion to actually have much impact there).
Beyond that, the PEP update focuses on clarifying a few other points without actually changing the proposal.