[Python-Dev] Python-3.0, unicode, and os.environ

Toshio Kuratomi a.badger at gmail.com
Fri Dec 5 23:21:50 CET 2008


Victor Stinner wrote:
>>> It would maybe be easier if os.environ supported bytes and unicode keys.
>>> But we have to keep these assertions:
>>>    os.environ[bytes] -> bytes
>>>    os.environ[str] -> str
>> I think the same choices have to be made here.  If LANG=C, we still have
>> to decide what to do when os.environ[str] is set to a non-ASCII string.
> 
> If the charset is US-ASCII, os.environ will drop non-ASCII values. But most 
> variables are ASCII only. Examples with my shell:
> 
Yes.  But you still have the question of what to do when:
os.environ[str] = chr(0x10000)

So I don't think it makes things simpler than having separate os.environ
and os.environb that update the same data behind the scenes.
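
To make the "same data behind the scenes" idea concrete, here is a rough
sketch, not a proposed implementation.  StrView is a hypothetical name, and
raising on an unencodable value is only one of the possible answers to the
question above; it is an assumption of this sketch.

```python
class StrView:
    """Hypothetical str view over a bytes-keyed environment mapping."""

    def __init__(self, raw, encoding):
        self._raw = raw            # shared bytes -> bytes store
        self._encoding = encoding  # e.g. "ascii" when LANG=C

    def __getitem__(self, key):
        value = self._raw[key.encode(self._encoding)]
        return value.decode(self._encoding)

    def __setitem__(self, key, value):
        # Assumption: an unencodable value raises UnicodeEncodeError
        # instead of being dropped or silently mangled.
        self._raw[key.encode(self._encoding)] = value.encode(self._encoding)

    def keys(self):
        # Only entries that decode cleanly are visible through this view.
        visible = []
        for k, v in self._raw.items():
            try:
                name = k.decode(self._encoding)
                v.decode(self._encoding)
            except UnicodeDecodeError:
                continue
            visible.append(name)
        return visible


# The bytes mapping plays the role of os.environb, the view of os.environ.
raw = {b"HOME": b"/home/user", b"WEIRD": b"\xff\xfe"}
env = StrView(raw, "ascii")
env["EDITOR"] = "vi"   # the write lands in the shared bytes store
```

Both views stay consistent because there is only one store; the str view
simply cannot see, or set, anything the charset cannot represent.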

>> Additionally, the subprocess question makes using the key value
>> undesirable compared with having a separate os.environb that accesses
>> the same underlying data.
> 
> The user should be able to choose bytes or unicode. Examples:

The subprocess question was posed further up the thread as, basically:
does the user need to access os.environb in order to override things in
the environment when calling subprocess?  I think the answer to that is
yes, since you might want to start with your environment and modify it
slightly when you call programs via subprocess.  If you just try to copy
os.environ, and os.environ only iterates through the decodable env vars,
the undecodable ones are silently lost.  If you have an os.environb to
copy, it becomes possible.
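
A minimal sketch of that copy-and-modify pattern, assuming an os.environb
and os.supports_bytes_environ like the ones that eventually shipped in
Python 3.2 on POSIX.  child_environment is a hypothetical helper name.

```python
import os


def child_environment(overrides):
    """Copy the whole parent environment and apply str overrides.

    Hypothetical helper: prefers the bytes environment so that variables
    which cannot be decoded still reach the child process.
    """
    if os.supports_bytes_environ:
        env = dict(os.environb)
        for key, value in overrides.items():
            env[os.fsencode(key)] = os.fsencode(value)
    else:
        # No bytes environment on this platform; fall back to str.
        env = dict(os.environ)
        env.update(overrides)
    return env
```

The result can then be handed to the child, for example
subprocess.Popen(['ls'], env=child_environment({'LC_ALL': 'C'})).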

>  - subprocess.Popen('ls') => use unicode environment (os.environ)
>  - subprocess.Popen(b'ls') => use bytes environment (os.environb)
> 
That's... not what I'd expect :-(

If I never touch os.environ and invoke subprocess the normal way, I'd
still expect the whole environment to be passed on to the program being
called.  This is how invoking programs manually, shell scripting, and
invoking programs from Perl, Python 2, etc. work.

Also, it's not really a good fit with the other things that key off of
the type of the initial argument.  os.listdir(b'.') changes the output
to bytes.  subprocess.Popen(b'ls') would instead change which
environment gets passed into the call.
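
For comparison, this is the os.listdir behaviour being referred to: the
type of the argument selects the type of the results, nothing else.
list_both is just an illustrative name.

```python
import os


def list_both(path):
    """Return the directory listing as str names and as bytes names."""
    as_str = os.listdir(path)                 # str in, str out
    as_bytes = os.listdir(os.fsencode(path))  # bytes in, bytes out
    return as_str, as_bytes
```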

>> Here's my problem with it, though.  With these semantics any program
>> that works on arbitrary files and runs on *NIX has to check
>> os.listdir(b'') and do the conversion manually.
> 
> Only programs that have to support strange environments like yours (mixing
> Shift-JIS and UTF-8) :-) Most programs don't have to support this kind of
> charset mixture.
> 
Any program that is intended to be distributed, accesses arbitrary
files, and works on *nix platforms needs to take this into account.
Just because the environment inside of my organization is sane doesn't
mean that when we release the code to customers, clients, or the free
software community that the places it runs will be as strict about these
things.
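
As an illustration of "taking this into account", here is a small sketch
that keeps the raw bytes name for file operations and only decodes for
display.  display_name is a hypothetical helper, and falling back to
replacement characters is an assumption of the sketch, not a
recommendation from this thread.

```python
def display_name(raw: bytes, encoding: str = "utf-8") -> str:
    """Return a printable form of a file name of unknown encoding."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        # Lossy: good enough to show the user, not to reopen the file.
        # Keep using the original bytes for open(), os.stat(), etc.
        return raw.decode(encoding, errors="replace")
```

A Shift-JIS name such as b'\x93\xfa\x96\x7b' fails strict UTF-8 decoding
and comes back with U+FFFD replacement characters, while a UTF-8 name
decodes unchanged.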

Are most programs specific to one organization, or are they distributed
to other people?  I can't answer that... everything I work on (except
passwords :-) is distributed, from sysadmin cron jobs to web
applications, since I'm lucky that my whole job is devoted to working on
free software.

-Toshio
