Python 3 is killing Python
steve+comp.lang.python at pearwood.info
Tue Jul 15 21:06:23 CEST 2014
On Tue, 15 Jul 2014 20:38:40 +0300, Marko Rauhamaa wrote:
> Python 2 has always had unicode strings and [byte] strings. They were
> always clearly distinguished. You really didn't have to change anything
> for "true Unicode support".
If that were true, then migrating from Python 2 to 3 would be much
simpler than it is.
Unicode strings in Python 2 are second-class entities. It's not just that
people will, in general, take the lazy way and write "foo" instead of
u"foo" for their strings. It's that the whole Python 2 virtual machine
is based on byte-strings, not Unicode strings, and u"" strings are bolted
on. Compare:
[steve@ando ~]$ python3.3 -c "π = 3.14; print(π+1)"
[steve@ando ~]$ python2.7 -c "π = 3.14; print(π+1)"
  File "<string>", line 1
    π = 3.14; print(π+1)
SyntaxError: invalid syntax
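The same difference can be checked from inside Python 3 itself. This is a minimal sketch, not part of the session above: the names `namespace` and `result` are illustrative, and it assumes a Python 3 interpreter, where non-ASCII identifiers are allowed.

```python
# Sketch (Python 3 only): non-ASCII names are ordinary identifiers.
name = "π"

# str.isidentifier() reports whether the string is a legal identifier.
is_valid_name = name.isidentifier()

# Run the one-liner from the shell example in a scratch namespace.
namespace = {}
exec("π = 3.14\nresult = π + 1", namespace)

print(is_valid_name)
print(namespace["result"])
```

Under Python 2.7 the `exec` line would fail the same way the command line does, because the source text is treated as bytes and `π` is not a legal ASCII identifier.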
Python 2 "helpfully" tries to guess what you want when you work with
bytes-pretending-to-be-strings, and when it guesses right, it's nice, but
when it guesses wrong, you're left with mysterious encoding and
decoding errors from code that doesn't appear to involve either. The whole
thing is a mess.
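To make the "guessing" concrete, here is a rough sketch written for Python 3. It imitates what Python 2 does behind the scenes when a byte-string meets a Unicode string (an implicit decode using the ASCII default) and contrasts it with Python 3, which refuses to mix the two types at all. The helper name `implicit_py2_style` is made up for illustration.

```python
# Sketch (runs on Python 3): imitate Python 2's implicit coercion.
text = "π = 3.14"
raw = text.encode("utf-8")  # the bytes a Python 2 "str" would hold

def implicit_py2_style(byte_string):
    """Hypothetical helper: decode with ASCII, as Python 2 does implicitly."""
    return byte_string.decode("ascii")

# Python 2's guess fails as soon as the bytes are not pure ASCII.
try:
    implicit_py2_style(raw)
    outcome = "decoded"
except UnicodeDecodeError:
    outcome = "UnicodeDecodeError"

# Python 3 does not guess: mixing str and bytes is an immediate TypeError.
try:
    "prefix: " + raw
    mixed = "concatenated"
except TypeError:
    mixed = "TypeError"

# The explicit route names the encoding, so nothing is left to guesswork.
explicit = raw.decode("utf-8")

print(outcome, mixed, explicit == text)
```

The point of the sketch is where the error shows up. In Python 2 the decode is hidden, so the failure surfaces in code that never mentions an encoding; in Python 3 the type mismatch is reported at the line that mixes the two.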