[Python-3000] Proposed Python 3.0 schedule (bytes/unicode again)
Bill Janssen
janssen at parc.com
Tue Oct 7 17:24:08 CEST 2008
Antoine Pitrou <solipsis at pitrou.net> wrote:
> > - And then, getopt and optparse modules should work on bytestring
> > vectors, so that you can use sys.argvb without writing your own
> > argument parser. They don't currently.
>
> Then we will gradually start moving all modules even remotely related with IO
> and filesystem stuff to a dual bytes/unicode API? That's precisely the kind of
> confusion we want to end with Py3k (the confusion between bytes and unicode as
> similar data types which could be used almost interchangeably without giving any
> consideration to semantics).
I wouldn't mix "IO" and "filesystem" that way. "IO" is complicated.
The problem is, as we've lately discovered, that things which "look
toward" the machine and the OS, like file system APIs or os.getcwd() or
os.environ, are really dealing in bit sequences of various kinds, not
strings, though the designers of these low-level artifacts have made
some effort to disguise that. Things which "look toward" the user, on
the other hand, are really dealing in strings, not bytes. There's a
conversion step in there, if you are trying to write a program to print
to stdout (that is, the user) all the files in a directory (the OS).
Now, we can provide an automatic converter which will work in lots of
cases, but we can't afford to just deny the cases in which it doesn't
work. We need bytes APIs to the OS and underlying machine and
networking and probably other things; we need string APIs to communicate
with the user.
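
To make the conversion step concrete, here is a minimal sketch of the "print
all the files in a directory" program, written against current Python 3
rather than anything settled at the time of this thread. The helper names
display_name and listing_for_user are made up for illustration, and the
choice of UTF-8 with replacement characters is an assumption about what the
user's terminal wants, not something the OS promises.

```python
import os


def display_name(raw_name):
    """Convert a file name as the OS reports it (bytes) into text for a person.

    Assumption: the user-facing encoding is UTF-8. Undecodable bytes become
    U+FFFD, so the result is always printable but not always reversible.
    """
    return raw_name.decode("utf-8", errors="replace")


def listing_for_user(directory):
    """Return printable names for every entry in `directory` (a bytes path)."""
    # Passing a bytes path makes os.listdir() return bytes names, so no
    # decoding happens on the OS-facing side and no entry is dropped.
    raw_names = sorted(os.listdir(directory))
    # The only conversion is here, at the boundary that faces the user.
    return [display_name(name) for name in raw_names]


if __name__ == "__main__":
    for line in listing_for_user(b"."):
        print(line)
```

The point of the split is that the program keeps the original bytes name for
any later call back into the OS, and uses the decoded text only for display.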
Bill