[issue8514] Create fsencode() and fsdecode() functions in os.path
report at bugs.python.org
Mon Apr 26 14:00:17 CEST 2010
STINNER Victor <victor.stinner at haypocalc.com> added the comment:
On Monday 26 April 2010 13:06:48, you wrote:
> I don't see what environment variables have to do with the file
A POSIX system offers only *one* function for querying the encoding:
nl_langinfo(CODESET). Python3 uses it for filenames, environment
variables and command line arguments.
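
To make that concrete, here is a minimal sketch of how both values can be inspected from Python. It only uses standard library calls; the try/except and the hasattr() guard are my own additions because nl_langinfo() is not available on every platform, and the two values are not guaranteed to match on every platform or Python version.

```python
import locale
import sys

# Encoding Python uses for filenames, environment variables and argv.
fs_encoding = sys.getfilesystemencoding()

# nl_langinfo() is POSIX-only, so guard the lookup.
codeset = None
if hasattr(locale, "nl_langinfo"):
    try:
        # Adopt the user's locale so CODESET reflects the environment.
        locale.setlocale(locale.LC_CTYPE, "")
    except locale.Error:
        pass  # unsupported locale settings: keep the current locale
    codeset = locale.nl_langinfo(locale.CODESET)

print("filesystem encoding:", fs_encoding)
print("nl_langinfo(CODESET):", codeset)
```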
Are you suggesting that Python3 should support a different encoding for
environment variables and the file system? How would the user configure it?
About filenames, Python3 chooses the encoding using the locale, but the user
cannot change it: sys.setfilesystemencoding() is removed by the site module.
> Also note that "mbcs" on Windows is a meta-encoding. The
> implementation of that encoding depends on the locale used by
> the Windows user. It's just a coincidence that this may actually
> work for the environment variables on Windows as well, but there's
> no guarantee.
os.getenv() should raise a TypeError on Windows if key is a byte string.
os.getenv() did not support byte strings. I patched it to support them
(issue #8391, r80421), but I don't like my fix because we should reject
byte strings *on Windows*. I would like to factorize the type check for
all operations on the file system and environment variables in
> On Unix, you often have the case that the environment variables
> use mixed encodings, e.g. the CGI interface is a good example
> where this happens per definition. The CGI environment can
> include file system paths, data encoded in Latin-1 (or some
> other encoding), etc.
Since Python3 chose to store environment variables as unicode strings on
Windows and POSIX, in this specific case you should convert the value back to
a byte string using fsencode() and then manipulate byte strings. Because
Python3 uses surrogateescape, you will get the original byte string values.
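
A minimal sketch of that round trip, assuming the locale encoding is UTF-8 and using a made-up Latin-1 value. It calls the codecs directly with the surrogateescape error handler instead of the proposed fsencode()/fsdecode() helpers, so that it does not depend on the platform's settings:

```python
# Hypothetical CGI-style value: Latin-1 bytes that are not valid UTF-8.
raw = b"caf\xe9"

# Assumption: the locale (file system) encoding is UTF-8.
encoding = "utf-8"

# Decoding with surrogateescape maps each undecodable byte to a lone
# surrogate code point instead of raising UnicodeDecodeError.
text = raw.decode(encoding, "surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
restored = text.encode(encoding, "surrogateescape")

# The application can now decode with the encoding it actually expects.
latin1_text = restored.decode("latin-1")

print(ascii(text), restored, ascii(latin1_text))
```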
My patch should help both cases: people using unicode objects and people using
the native OS type (bytes on POSIX). As written in my previous message, you
can still use byte strings if you want. My patch doesn't change that (on POSIX
title: Create fs_encode() and fs_decode() functions in os.path -> Create fsencode() and fsdecode() functions in os.path
Python tracker <report at bugs.python.org>