[Python-Dev] Python-3.0, unicode, and os.environ

Nick Coghlan ncoghlan at gmail.com
Fri Dec 5 10:21:32 CET 2008


glyph at divmod.com wrote:
> At least this time I think I've encapsulated pretty much my entire
> argument here, so if you don't buy it, we can probably just agree to
> disagree :).

Glyph, the only point I would add to your message is this one:

Adding a "blessed" way to encode arbitrary binary data into a Python 3.0
str object strikes me as giving up on one of the key advances in the new
version of the language.

8-bit strings were a problem in Python 2.x because they blurred the
boundary between arbitrary binary data and ASCII or latin-1 character data.

One of the most interesting aspects of Python 3.0 is its attempt to get
developers to be explicit about this distinction (both in the code and
in their own minds) by enforcing separation between arbitrary binary
data (held in bytes and bytearray instances) and character data (held in
str instances).
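
To make that separation concrete, here is a minimal sketch of how Python 3 behaves when the two kinds of data meet. The variable and function names are hypothetical, chosen only for illustration, and UTF-8 is an assumed encoding rather than anything prescribed above.

```python
# Illustrative sketch only: names here are hypothetical, not from the thread.
text = "caf\u00e9"            # character data, held in a str
data = text.encode("utf-8")   # binary data, held in a bytes object (UTF-8 assumed)


def try_concat(a, b):
    """Return a + b, or the exception type name if Python refuses to mix them."""
    try:
        return a + b
    except TypeError as exc:
        return type(exc).__name__


mixed = try_concat(text, data)     # str + bytes is rejected rather than guessed at
roundtrip = data.decode("utf-8")   # crossing the boundary takes an explicit decode
```

The point of the sketch is that the conversion has to be written out, with an encoding named, at the place where bytes become text.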

I don't understand how tunneling arbitrary binary data through str
instances (*regardless* of encoding mechanism) can possibly fail to
recreate exactly the same "is it text or binary data?" ambiguity
problems that the str/bytes split is intended to eliminate. And if that
happens, then what exactly was the point of moving to an all-Unicode
string model for Py3k?
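
The ambiguity can be sketched as follows. This is only an illustration under stated assumptions: latin-1 stands in for whatever encoding mechanism might be chosen, and the names are hypothetical.

```python
# Illustrative sketch only: latin-1 is an assumed stand-in for "any mechanism".
payload = bytes([0x89, 0x50, 0x4E, 0x47, 0x00, 0xFF])  # arbitrary binary data
tunneled = payload.decode("latin-1")                    # now an ordinary str
genuine = "real text"                                   # actual character data


def looks_like_text(value):
    """The only check the type offers: is the value a str instance?"""
    return isinstance(value, str)


# Both values pass the same check, so the type no longer says which is which.
both_pass = looks_like_text(tunneled) and looks_like_text(genuine)

# Recovery works only if the receiver already knows the tunneling convention.
recovered = tunneled.encode("latin-1")
```

Once the binary payload is inside a str, nothing in the object records that it was never text, which is the same question the str/bytes split was meant to settle.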

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------

