Python 3 is killing Python

Steven D'Aprano steve+comp.lang.python at
Tue Jul 15 21:06:23 CEST 2014

On Tue, 15 Jul 2014 20:38:40 +0300, Marko Rauhamaa wrote:

> Python 2 has always had unicode strings and [byte] strings. They were
> always clearly distinguished. You really didn't have to change anything
> for "true Unicode support".

If that were true, then migrating from Python 2 to 3 would be much 
simpler than it is.

Unicode strings in Python 2 are second-class entities. It's not just that
people will, in general, take the lazy way and write "foo" instead of
u"foo" for their strings. It's that the whole Python virtual machine is
based on byte-strings, not Unicode strings, and u"" strings are bolted on
top. Even identifiers are limited to ASCII in Python 2, while Python 3
accepts Unicode ones:

[steve@ando ~]$ python3.3 -c "π = 3.14; print(π+1)"
4.140000000000001
[steve@ando ~]$ python2.7 -c "π = 3.14; print(π+1)"
  File "<string>", line 1
    π = 3.14; print(π+1)
SyntaxError: invalid syntax

Python 2 "helpfully" tries to guess what you want when you work with
bytes-pretending-to-be-strings, and when it guesses right, it's nice, but
when it guesses wrongly, you're left with mysterious encoding and
decoding errors from code that doesn't appear to involve either. The whole
thing is a mess.
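
To make the "guessing" concrete, here is a rough sketch that runs on
Python 3. The helper py2_style_concat is a made-up name, not anything from
the standard library; it only imitates what Python 2 does behind your back
when you add a byte-string to a unicode string, assuming the default
encoding is ASCII (which is the usual Python 2 setting).

```python
# Runs on Python 3. py2_style_concat is a hypothetical helper that only
# emulates Python 2's implicit str -> unicode coercion.

text = "café"                 # text (Unicode) string
data = text.encode("utf-8")   # the same characters as UTF-8 bytes

# Python 3 refuses to guess: mixing text and bytes fails at once.
try:
    text + data
    mixing_error = None
except TypeError as exc:
    mixing_error = type(exc).__name__


def py2_style_concat(u, b):
    """Emulate Python 2's guess: decode the bytes as ASCII, then join."""
    return u + b.decode("ascii")


# The guess is right for pure-ASCII bytes, so this quietly works...
ascii_result = py2_style_concat("abc", b"def")

# ...and wrong for anything else, so plain concatenation raises a
# decoding error even though the caller never asked to decode anything.
try:
    py2_style_concat(text, data)
    decode_error = None
except UnicodeDecodeError as exc:
    decode_error = str(exc)

print(mixing_error)
print(ascii_result)
print(decode_error)
```

The point of the comparison: Python 3 fails early with a TypeError at the
line that mixes the two types, while the Python 2 style of coercion only
fails once non-ASCII data happens to turn up.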

