[Python-Dev] Low-Level Encoding Behavior on Python 3
Stefan Behnel
stefan_ml at behnel.de
Wed Mar 16 17:11:21 CET 2011
Armin Ronacher, 16.03.2011 16:57:
> On 3/16/11 3:48 AM, Antoine Pitrou wrote:
>> I may be mistaken, but you seem to conflate two things: encoding of
>> file names, and encoding of file contents. I guess that virtualenv
>> chokes on the file contents, but most of your argument seems related to
>> encoding of file names (aka "filesystem encoding").
> These are two pretty unrelated problems but both are problems nonetheless.
> The filename encoding should not be guessed from the environment variables
> as those are from the connecting client. The default encoding for file
> contents also should not be platform dependent. It *will* lead to people
> thinking it works when in practice it will break if they move their code to
> a remote server and SSH into it and then trigger the code execution.
>
> I argue that the first is just wrong (filename encoding guessing) and the
> latter is dangerous (file content encoding being platform dependent).
Antoine was arguing that it's not CPython's fault that virtualenv
expects it to correctly guess the encoding of a file it wants to read.
CPython makes an educated guess based on the current environment setup, and
if that's not correctly configured, it's the user's fault. As you indicated
yourself, it does work most of the time. That's all you should expect from
a default.
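
For reference, the guess discussed above can be inspected directly. This is a
minimal sketch, assuming a Python 3 version where open() in text mode without
an explicit encoding falls back to locale.getpreferredencoding(False); the
helper name default_encodings is hypothetical and only used for illustration.

```python
import locale
import sys


def default_encodings():
    """Return the encodings CPython derives from the current environment."""
    return {
        # Used for file *contents* when open() gets no encoding argument
        # (assumption: no UTF-8 mode or other override is active).
        "file contents": locale.getpreferredencoding(False),
        # Used for file *names* passed to and returned from the OS.
        "file names": sys.getfilesystemencoding(),
    }


for purpose, name in default_encodings().items():
    print(purpose, "->", name)
```

Running this locally and again over an SSH session with different LANG or
LC_* settings may print different values, which is the situation Armin
describes.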
> virtualenv itself is already fixed and explicitly tells it to read with
> UTF-8 encoding.
That's the right way to deal with encoded file content.
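
A short sketch of what "explicitly tells it to read with UTF-8" can look
like in practice. This is not virtualenv's actual code; the helper name
read_text_utf8 and the temporary file are assumptions made for the example.

```python
import os
import tempfile


def read_text_utf8(path):
    """Read a text file as UTF-8, independent of the locale settings."""
    with open(path, encoding="utf-8") as f:
        return f.read()


# Create a scratch file so the example is self-contained.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
try:
    # Write with the same explicit encoding that is used for reading.
    with open(path, "w", encoding="utf-8") as f:
        f.write("caf\u00e9")
    text = read_text_utf8(path)
finally:
    os.remove(path)
```

Because the encoding is named on both the write and the read, the result no
longer depends on the environment of whoever runs the code.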
Stefan