[Python-Dev] Bytes path support

Nick Coghlan ncoghlan at gmail.com
Thu Aug 21 14:55:33 CEST 2014


On 21 August 2014 14:52, Cameron Simpson <cs at zip.com.au> wrote:
>
> Oh, and I reject Nick's characterisation of POSIX as "broken". It's
> perfectly internally consistent. It just doesn't match what he wants.
> (Indeed, what I want, and I'm a long time UNIX fanboy.)

The part that is broken is the idea that locale encodings are a viable
solution to conveying the appropriate encoding to use to talk to the
operating system. We've tried trusting them with Python 3, and they're
reliably wrong in certain situations. systemd is apparently better
than upstart at setting them correctly (e.g. for cron jobs), but even
it can't defend against an erroneous (or deliberate!) "LANG=C", or ssh
environment forwarding pushing a client's locale to the server. It's
worth looking through some of Armin Ronacher's complaints about Python
3 being broken on Linux, and seeing how many of them boil down to
"trusting the locale is wrong, Python 3 should just assume UTF-8 on
every POSIX system, the same way it does on Mac OS X". (I suspect
ShiftJIS, ISO-2022, et al users might object to that approach, but
it's at least a more viable choice now than it was back in 2008)

I still think we made the right call at least *trying* the idea of
trusting the locale encoding (since that's the officially supported
way of getting this information from the OS), and in many, many
situations it works fine. But I suspect we may eventually need to
resolve the technical issues currently preventing us from deciding to
ignore the environmental locale during interpreter startup and try
something different (such as always assuming UTF-8, or trying to force
C.UTF-8 if we detect the C locale, or looking for the systemd config
files and using those to set the OS encoding, rather than the
environmental locale).

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


More information about the Python-Dev mailing list