[Python-Dev] Python 3.5 now uses surrogateescape for the POSIX locale

Nick Coghlan ncoghlan at gmail.com
Tue Mar 18 10:48:38 CET 2014


On 18 March 2014 19:13, Victor Stinner <victor.stinner at gmail.com> wrote:
> 2014-03-18 9:08 GMT+01:00 Nick Coghlan <ncoghlan at gmail.com>:
>> On 18 Mar 2014 11:56, "Victor Stinner" <victor.stinner at gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> I modified Python 3.5 to use the "surrogateescape" error handler (PEP
>>> 383) for stdin and stdout when the LC_CTYPE locale is POSIX ("C"
>>> locale):
>>> http://bugs.python.org/issue19977
>>
>> Yay, thanks Victor. I'll let the Fedora folks know this has been merged, as
>> we may seriously consider applying this as a vendor patch to our build of
>> Python 3.4 (while I agree this isn't a bug fix, the current behaviour also
>> poses a problem for Fedora as more core utilities start migrating to Python
>> 3).
>
> Please don't cherry-pick this change in Fedora if it is not done in
> Python 3.4. It changes the behaviour of Python and I would prefer to
> have the same behaviour on the same Python version on all platforms.
>
> I'm not against backporting the change in Python 3.4.1. It can be seen
> as a bugfix. I don't think that anyone wants a Unicode error when
> reading or printing non-ASCII data from stdin/to stdout. But I would
> like the opinion of other developers before doing that.

Well, the concern has always been the risk of silently generating bad
data if there is a mismatch between the OS encoding and the stream
encodings. That's why it took so long to make this change at all - we
had to figure out that the underlying problem was really the ease with
which even properly configured Linux systems could end up running
Python 3 code in the POSIX locale, and thus end up with improperly
configured standard streams. Enabling "surrogateescape" by default
only when the standard stream encoding is "ascii" helps to mitigate
that risk, while still dealing with the main problem. I meant to try
to get this into 3.4 (since a couple of the Fedora folks convinced me
it was a problem), but there are only so many hours in the day, and it
took me quite a while to fully grasp the actual problem.
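
As a rough illustration of what the handler does at the codec level
(a sketch only, not the actual change from issue 19977; the sample
string and variable names are made up for the example): strict ASCII
decoding rejects non-ASCII bytes, while "surrogateescape" maps each
undecodable byte to a lone surrogate and restores it on encoding.

```python
# Illustrative sample: bytes that are valid UTF-8 but not ASCII,
# such as a filename or a line of input handed over by the OS.
raw = "café".encode("utf-8")  # b'caf\xc3\xa9'

# Strict ASCII decoding raises on the first non-ASCII byte.
try:
    raw.decode("ascii")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# With surrogateescape, each undecodable byte becomes a lone surrogate
# in the range U+DC80..U+DCFF instead of raising an error.
text = raw.decode("ascii", errors="surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
round_tripped = text.encode("ascii", errors="surrogateescape")

print(strict_failed)         # True
print(ascii(text))           # 'caf\udcc3\udca9'
print(round_tripped == raw)  # True
```

The round trip is lossless, which is what lets data pass from stdin to
stdout untouched. It is also where the risk above comes from: the
bytes are passed through unchecked, so an encoding mismatch is no
longer reported as an error.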

If folks are open to backporting this change to 3.4.1, then yes, I'd
definitely prefer an upstream solution. Otherwise, it will be up to
the Fedora Python maintainers to decide what they want to do.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

