[Python-Dev] Python-3.0, unicode, and os.environ

Adam Olsen rhamph at gmail.com
Fri Dec 5 06:46:14 CET 2008


On Thu, Dec 4, 2008 at 10:14 PM, Guido van Rossum <guido at python.org> wrote:
> At the risk of bringing up something that was already rejected, let me
> propose something that follows the path taken in 3.0 for filenames,
> rather than doubling back:
>
> For os.environ, os.getenv() and os.putenv(), I think a similar
> approach as used for os.listdir() and os.getcwd() makes sense: let
> os.environ skip variables whose name or value is undecodable, and have
> a separate os.environb() which contains bytes; let os.getenv() and
> os.putenv() do the right thing when the arguments passed in are bytes.

+1 (as that's what I suggested)
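
A rough sketch of the skipping behaviour, assuming UTF-8 as the system
encoding. split_environ is a made-up helper for illustration, not a
proposed API, and the sample environment is invented:

```python
def split_environ(raw_env, encoding="utf-8"):
    """Return (text_env, bytes_env) from a mapping of bytes -> bytes.

    Entries whose name or value cannot be decoded are left out of
    text_env but remain available in bytes_env.
    """
    text_env = {}
    bytes_env = dict(raw_env)  # the bytes view keeps everything
    for name, value in raw_env.items():
        try:
            text_env[name.decode(encoding)] = value.decode(encoding)
        except UnicodeDecodeError:
            continue  # undecodable entry: skipped in the text view only
    return text_env, bytes_env


# Hypothetical environment: BAD holds a Latin-1 byte that is not valid UTF-8.
raw = {b"HOME": b"/home/adam", b"BAD": b"caf\xe9"}
text_env, bytes_env = split_environ(raw)
```

Here text_env only contains HOME, while bytes_env still has both entries,
so code that needs to be robust can fall back to the bytes view.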


> For sys.argv, because it's positional, you can't skip undecodable
> values, so I propose to use errors='replace' for the decoding; again, we
> can add sys.argvb that contains the raw bytes values. The various
> os.exec*() and os.spawn*() calls (as well as os.system(), os.popen()
> and the subprocess module) should all accept bytes as well as strings.

+1.  I wish there were a better solution for sys.argv.
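
To illustrate why this is only a partial fix, here is a small sketch of
decoding with replacement, again assuming UTF-8. decode_argv and the sample
arguments are hypothetical:

```python
def decode_argv(raw_argv, encoding="utf-8"):
    """Decode every argument, substituting U+FFFD for undecodable bytes.

    Nothing is dropped, so positions stay the same as in raw_argv.
    """
    return [arg.decode(encoding, errors="replace") for arg in raw_argv]


# Hypothetical command line: the second argument is not valid UTF-8.
raw_argv = [b"prog", b"caf\xe9", b"ok"]
argv = decode_argv(raw_argv)
```

The positions line up, but argv[1] no longer encodes back to the original
bytes, so a program that wants to open that file still needs the raw
values.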


> On Windows, the bytes APIs should probably not exist.

-0.  I'd prefer that the bytes APIs return UTF-16 bytes and that the
unicode APIs become validating.
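
By "validating" I mean roughly the following. This is only a sketch:
validate_utf16_name is a made-up name, and it assumes the raw bytes are
little-endian UTF-16 code units:

```python
def validate_utf16_name(raw):
    """Return the decoded name, or None if raw is not well-formed UTF-16-LE."""
    try:
        return raw.decode("utf-16-le")
    except UnicodeDecodeError:
        return None  # e.g. a lone surrogate or a truncated code unit


good = "file.txt".encode("utf-16-le")
# A high surrogate (U+D800) followed by 'a' instead of a low surrogate.
bad = b"\x00\xd8" + "a".encode("utf-16-le")

good_name = validate_utf16_name(good)
bad_name = validate_utf16_name(bad)
```

The bytes view would hand back `bad` unchanged, while the unicode view
would refuse it rather than produce an ill-formed string.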


> I predict that most developers can get away with not using the bytes
> APIs at all. The small minority that needs to be robust if not all
> filenames use the system encoding can use the bytes APIs. This would
> be developers on various Unix systems except OSX (which uses UTF-8 for
> its filesystems), and perhaps the occasional developer on OSX whose
> app needs to work with files on mounted filesystems that use a
> different encoding.


-- 
Adam Olsen, aka Rhamphoryncus

