Python 3 is killing Python
rosuav at gmail.com
Wed Jul 16 08:26:03 CEST 2014
On Wed, Jul 16, 2014 at 3:52 PM, Marko Rauhamaa <marko at pacujo.net> wrote:
> Steven D'Aprano <steve+comp.lang.python at pearwood.info>:
>> On Tue, 15 Jul 2014 23:01:25 +0300, Marko Rauhamaa wrote:
>>> In fact, I find the lazy use of Unicode strings at least as scary as
>>> the lazy use of byte strings, especially since Python 3 sneaks
>>> Unicode to the outer interfaces of the program (files, IPC).
>> I'm not entirely sure I understand what you mean by "lazy use of
>> Unicode strings". And I especially don't understand what you mean by
>> "sneak". The fact that strings are Unicode is *the* biggest and most
>> obvious new feature of Python 3.
> I mean that sys.stdin and sys.stdout should deal with byte strings. I
> mean that open(path) should open a file in binary mode. Thankfully, the
> subprocess methods exchange bytes by default.
Let's take a step back from the standard I/O streams and look at this
one concept: Asking the user to enter his/her name. The user will have
a name which consists of characters (at least, we hope so; cases where
this is not true do exist, but are outside the scope of this
discussion), not bytes. The program wants to use those characters, not
bytes. If I create a window with tkinter and ask the user to enter a
name, I'll get back a Unicode string:
(By the way, this suffers from the common flaw of asking for separate
first and last names. That's not reliable in terms of people's names,
but it's no different in terms of bytes and strings.)
(Also by the way, why is a Python course advertising that its web site
is written in PHP?)
Whether I use Python 2 (changing the import to Tkinter) or Python 3
(running the code unchanged), I get back a Unicode string (easily
proven by looking at its repr() in show_entry_fields()), because the
user typed *text* into the widget. This is what everyone will expect.
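To make the text/bytes distinction concrete without a GUI, here is a minimal Python 3 sketch. The name is a made-up example and UTF-8 is an assumed encoding; neither comes from the tkinter code under discussion.

```python
# Hypothetical example name; the user thinks of it as three characters.
name = "Zo\u00eb"  # "Zoë"

# The program works with characters...
char_count = len(name)

# ...while bytes only appear once an encoding is chosen (UTF-8 is an
# assumption here; another encoding would give different bytes).
encoded = name.encode("utf-8")
byte_count = len(encoded)

# Decoding with the same encoding recovers the original text.
round_tripped = encoded.decode("utf-8")

print(repr(name), char_count)
print(repr(encoded), byte_count)
```

The character count and the byte count differ, which is why treating a name as bytes quietly ties the program to one particular encoding.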
Now, the standard I/O streams might be connected to a console, or
might be reading from a pipe. This does add a level of complexity, as
it's possible to read either text or bytes from them; but Python
defaults to the most common case, where they're connected to a
console, and does its best to allow print() to write Unicode to any
console. (With limited success on some consoles; Windows' cmd.exe has
problems. That's not Python's fault.) If you want binary, you can
easily switch to binary mode. Maybe it would be better to have a
simple function "change standard stream(s) to binary", but the default
is still correct: most of the time, you want to work with text.
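As a rough sketch of "switching to binary" in Python 3: a text stream such as sys.stdout normally exposes the underlying binary stream as its .buffer attribute. The helper name below is hypothetical, and the in-memory stream stands in for a real console or pipe so the result can be inspected.

```python
import io


def write_text_then_bytes(stream):
    """Write text via the text layer, then raw bytes via the binary layer."""
    stream.write("caf\u00e9\n")       # text: encoded by the stream itself
    stream.flush()                    # push pending text down before raw bytes
    stream.buffer.write(b"\x00\xff")  # bytes: written as-is, no encoding step
    stream.buffer.flush()


# Stand-in for sys.stdout: a text wrapper over an in-memory byte buffer.
raw = io.BytesIO()
fake_stdout = io.TextIOWrapper(raw, encoding="utf-8", newline="\n")
write_text_then_bytes(fake_stdout)
result = raw.getvalue()
```

Passing sys.stdout instead of fake_stdout should behave the same way when it is a regular text stream; if it has been replaced by an object without a .buffer attribute, this approach won't work.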
> To me, the main difference between Python 2 and Python 3 is that in the
> former, I use "..." everywhere, and in the latter, I use b"..."
> everywhere. If I should need advanced text processing features, I'll go
> through a decode() and encode().
Why do you work with the underlying representation (bytes) instead of
the abstraction (strings)? To be consistent, you should probably
eschew Python dictionaries in favour of a manually-implemented
hashtable, and studiously avoid Python source code by hand-writing
>> As of right now, *new* projects ought to be written in Python 3.3 or
>> better, unless you have a compelling reason not to. You don't have to
>> port old projects in order to take advantage of Python 3 for new
> But my distro only provides Python 3.2. What's wrong with Python 3.2?
> Why didn't anybody tell me to put off the migration?
It's pretty easy to spin up a CPython 3.4 for Debian Wheezy. Or you
might be able to just grab a package from Jessie, where CPython 3.4 is
standard. Debian, like Red Hat, values stability over currency, so
once Wheezy went into freeze on 2012-06-30, the
current-at-the-time Python was locked in (Python 3.3 wasn't released
until 2012-09-29), in order to allow packages to depend on it and
be able to trust it.
(It's possible you're not on Debian Wheezy, but on some other distro
that also ships 3.2 with its current release. The policies are likely
to be similar.)