[stdlib-sig] Python-the-platform vs Python-the-framework
Orestis Markou
orestis at orestis.gr
Wed Sep 16 17:35:36 CEST 2009
The recent discussions on breaking the standard library made something
click for me. I wrote a long blog post about it, which I am pasting
here. The original is at http://orestis.gr/blog/2009/09/16/backwards-compatibility-straw-man/
Hopefully it will help the discussion go forward, and perhaps help the
people who are writing PEPs.
TL;DR was TL;DR: By acknowledging the different ways Python is used,
we will be in a better position to evolve it.
TL;DR version: Python can be separated into two parts: Python-the-
platform and Python-the-framework. The former is mainly the language
and some core services, the latter is the "batteries". Some people
code against frameworks that bypass the batteries, some code against
the batteries themselves. While it's important that the batteries are
there and relatively stable, since they are mostly domain-specific,
they are superseded by newer, better and more powerful ones every once
in a while. Python-core are experts on the language, but not on
the various domains. We should let domain experts develop solutions
for their domains outside the stdlib, and not confuse users by keeping
outdated and cranky modules in the stdlib.
Original:
== Backwards compatibility is a straw man ==
I recently signed up to stdlib-sig so I could just nod in agreement to
the people that suggested that the stdlib needs to evolve. In the
discussions that ensued, the backwards compatibility argument came up
often. I think it's not a valid argument for the specific discussion,
though. Here are my thoughts.
## What is stdlib?
The standard library is a set of packages and modules that get shipped
by default with the Python interpreter. Despite what I thought when I
was starting with Python, it doesn't represent the so-called "best-
practices" modules. There is no overarching design that dictates why
things are the way they are. Instead, it just represents a set of
modules that have been picked up at some point in time. Some of them
are close to the language (such as collections, itertools and others)
and some are domain-specific tools. The difference is significant.
Here is an attempt to divide the ones visible at [the docs](http://docs.python.org/library/).
Of course, I'm choosing here based on personal preference, YMMV.
Here we go:
### Language
String Services:
> `string, re, struct, StringIO, cStringIO, codecs, unicodedata`
Data types:
> `datetime, collections, heapq, bisect, array, sets, sched, mutex,
Queue, weakref, UserDict, UserList, UserString, types, new, copy,
pprint, repr`
Numeric and Mathematical Modules:
> `numbers, math, cmath, decimal, fractions, random, itertools,
functools, operator`
(why itertools, functools and operator are here is beyond me)
File and Directory Access:
> `os.path, stat, statvfs, filecmp, tempfile, glob, fnmatch, shutil`
Data Persistence:
> `pickle, cPickle, copy_reg, shelve, marshal`
Data Compression and Archiving:
> `zlib, gzip, bz2, zipfile, tarfile`
* this category can be argued to be both domain-specific and close to
the language.
Cryptographic Services:
> `hashlib, hmac, md5, sha`
Generic Operating System Services:
> `os, io, time, getpass, platform, errno, ctypes`
Optional Operating System Services:
> `select, threading, thread, dummy_threading, dummy_thread,
multiprocessing, mmap`
Interprocess Communication and Networking:
> `subprocess, socket, ssl, signal, popen2, asyncore, asynchat`
* asyncore and asynchat can be said to be domain specific, but async
io is fundamental IMO
Internet Data Handling:
> `base64, binhex, binascii, uu`
* these should probably live alongside codecs
Internet Protocols and Support:
> `wsgiref, uuid`
* wsgi is meant as an interop protocol, so I put it close to the
language.
Internationalization:
> `gettext, locale`
Development Tools:
> `pydoc, doctest, unittest, 2to3, test`
Debugging and Profiling:
> `bdb, pdb, hotshot, timeit, trace`
Python Runtime Services:
> `sys, __builtin__, future_builtins, __main__, warnings, contextlib,
abc, atexit, traceback, __future__, gc, inspect, site, user, fpectl`
Importing Modules:
> `imp, imputil, zipimport, pkgutil, modulefinder, runpy`
Python Language Services:
> `parser, ast, symtable, symbol, token, keyword, tokenize, tabnanny,
py_compile, compileall, dis, pickletools, distutils, pyclbr, compiler`
* a lot of these arguably are domain-specific, but given the domain
is Python...
* the compiler package is included here as well
### Domain specific
String Services:
> `difflib, textwrap, stringprep, fpformat`
Data types:
> `calendar`
File and Directory Access:
> `fileinput, linecache, dircache, macpath`
Data Persistence:
> `anydbm, whichdb, dbm, gdbm, dbhash, bsddb, dumbdbm, sqlite3`
File Formats:
> `csv, ConfigParser, robotparser, netrc, xdrlib, plistlib`
Generic Operating System Services:
> `optparse, getopt, logging, curses, curses.*`
Optional Operating System Services:
> `readline, rlcompleter`
Internet Data Handling:
> `email, json, mailcap, mailbox, mhlib, mimetools, mimetypes,
MimeWriter, mimify, multifile, rfc822, quopri`
Structured Markup Processing Tools:
> `HTMLParser, sgmllib, htmllib, htmlentitydefs, xml.*`
Internet Protocols and Support:
> `webbrowser, cgi, cgitb, urllib, urllib2, httplib, ftplib, poplib,
imaplib, nntplib, smtplib, smtpd, telnetlib, urlparse, SocketServer,
BaseHTTPServer, SimpleHTTPServer, CGIHTTPServer, cookielib, Cookie,
xmlrpclib, SimpleXMLRPCServer, DocXMLRPCServer`
Multimedia Services:
> `audioop, imageop, aifc, sunau, wave, chunk, colorsys, imghdr,
sndhdr, ossaudiodev`
Program Frameworks:
> `cmd, shlex`
GUI with Tk:
> `Tkinter, Tix, ScrolledText, turtle, IDLE, Others`
Custom Python Interpreters:
> `code, codeop`
Restricted Execution:
> `rexec, Bastion`
* Both have been removed from Python 3.0
Miscellaneous Services:
> `formatter`
MS Windows Specific Services:
> `msilib, msvcrt, _winreg, winsound`
Unix Specific Services:
> `posix, pwd, spwd, grp, crypt, dl, termios, tty, pty, fcntl, pipes,
posixfile, resource, nis, syslog, commands`
Mac OS X specific services:
> `ic, MacOS, macostools, findertools, EasyDialogs, Framework,
autoGIL, ColorPicker`
MacPython OSA Modules:
> `gensuitemodule, aetools, aepack, aetypes, MiniAEFrame`
SGI IRIX Specific Services:
> `al, AL, cd, dl, DL, flp, fm, gl, DEVICE, GL, imgfile, jpeg`
SunOS Specific Services:
> `sunaudiodev, SUNAUDIODEV`
### GRAND TOTAL
130 language-related
151 domain-specific
Damn, isn't that a lot of packages? For reference, PyPI currently hosts
~7500 of them. Truly, Python has a lot of batteries.
## A platform, or a framework?
I can easily see a neat split there - the first half is Python, the
platform. The second half is the batteries. However, nowadays the
batteries are not enough. While you may be able to write a quick and
dirty script with them, if you're doing web stuff you're probably
using another framework, if you're doing desktop stuff you're probably
using another toolkit as well. Of course, there are other uses I
probably know nothing about, and for them Python *becomes* the
framework.
I know that many frameworks built on top of Python-the-platform start
with the batteries, and then they start writing their own
implementations to fix bugs or add features. Django-the-framework runs
on Python-the-platform 2.3-2.6 so it can't rely on features being
present or bugs fixed in the batteries - it has its own.
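A common shape this takes is the guarded import. The sketch below is only an illustration and is not taken from Django's code: it prefers the stdlib `json` module (in the stdlib from Python 2.6 onwards) and falls back to the third-party `simplejson` package, which is assumed to be installed on older interpreters.

```python
# Prefer the battery when the running Python ships it; otherwise fall back
# to an external package that offers the same interface.
try:
    import json  # part of the stdlib from Python 2.6 onwards
except ImportError:
    # Assumption: simplejson has been installed separately on older Pythons.
    import simplejson as json

# Either way, the rest of the program only talks to the name `json`.
encoded = json.dumps({"batteries": "included"})
decoded = json.loads(encoded)
print(decoded["batteries"])
```

The cost of this pattern is that every framework ends up carrying its own copy of the fallback logic, which is the duplication described above.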
## Backwards compatible
My issue with the backwards compatibility argument is this: No one
forces anyone to update to any version of Python. Developers make a
conscious decision - to develop software for a specific (or a range
of) version of Python, and specific versions for all the other
libraries they depend on. *Any* change to the dependencies of a piece
of software may lead to breakage. I see no reason why Python should be
different for that purpose.
I can't see backwards compatibility as an argument against upgrading
Python, adding features, deprecating and removing modules, and of
course fixing bugs. (Aside: Microsoft is so backwards compatible that
it emulates bugs if important programs need them. We don't want to do
that!) Instead, I see backwards compatibility as an argument *for*
better isolation of Python-the-framework. If a program needs specific
versions of Python and libraries, it should be trivial to guard them
against change. If an operating system depends on a specific version
of Python, it should hide it away and not allow modifications.
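At its simplest, a program can state the range of interpreter versions it was developed against and refuse to run elsewhere. This is a minimal sketch of that idea, not a packaging recommendation; the function name `is_supported` and the bounds (borrowed from the Django example above) are hypothetical.

```python
import sys

# Hypothetical range a program declares it was developed and tested against.
LOWEST = (2, 3)
HIGHEST = (2, 6)

def is_supported(version, lowest=LOWEST, highest=HIGHEST):
    """Return True if the (major, minor) part of `version` lies in the range."""
    return lowest <= tuple(version[:2]) <= highest

# A real program would check sys.version_info at startup and exit early.
if not is_supported(sys.version_info):
    print("warning: running on an untested Python %d.%d" % tuple(sys.version_info[:2]))
```

This only detects a mismatch; actually shielding the program from change means shipping or pinning the interpreter and libraries alongside it.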
On the other hand, I would argue against radical changes to Python-the-
platform. Of course, this has been the case so far, with one exception
in Python 3.0 to fix issues that needed to be fixed. In fact, there's
a nice forwards-compatible feature for changes to the platform -
`__future__`. People have been upgrading to new Python features with
minor complaints, so I don't see why changing the batteries part of
stdlib is tickling people so much.
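The `__future__` module also records when each such change became available and when it becomes the default, which is the kind of explicit schedule the batteries lack. The sketch below just reads that metadata; the helper name `describe` is my own.

```python
import __future__

def describe(feature_name):
    """Return (name, first optional release, mandatory release) as (major, minor)."""
    feature = getattr(__future__, feature_name)
    optional = feature.getOptionalRelease()[:2]
    mandatory = feature.getMandatoryRelease()[:2]
    return feature_name, optional, mandatory

# Two changes that could be opted into in 2.x and became the default in 3.0.
for name in ("division", "print_function"):
    print(describe(name))
```

In a module that wants the new behaviour early, the opt-in itself is a single line at the top of the file, such as `from __future__ import division`.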
## Best of breed
When I started with Python, I only used modules from stdlib - I had no
idea about PyPI, and I assumed that things from python-core would be
more high-quality. However, this is only true for the language
modules, not the domain specific modules. The reason is simple -
python-core are experts on Python and language design, but not experts
on the numerous domains the batteries cover. There are now
replacements for most, if not all (os-specific stuff probably
excluded) domain-specific modules. People trying to get a GUI running
with Tk and not knowing about wx, Qt, Gtk, or the platform-specific
choices is bad. People trying to do image manipulation and not knowing
about PIL is bad.
I would argue that domain-specific parts should be spun off the stdlib
and be released as separate PyPI modules. We can keep Python-the-
framework going by having a download with the kitchen sink provided
(as [Jesse Noller proposed](http://mail.python.org/pipermail/stdlib-sig/2009-September/000398.html)),
and cooperate with packagers/distributions so that they can
fortify their installations against change.
## Conclusion
The argument on stdlib-sig is huge, and thankfully it seems that
something is getting done in the end. I expect some people to agree
with me, and some to disagree. Writing my thoughts makes me think, so
please keep in mind that I am willing to be persuaded otherwise, with
the correct arguments.
As far as my day to day use is concerned, 99% of the batteries could
disappear from my site-packages, and I would not care. Of course,
packages I actually use and import (twisted, pyobjc) _would_ care
(actually, both of those will most likely use their own batteries). I
wonder how many people find this situation familiar. Find your
imports, and see what the results are.
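One way to find your imports is to let the `ast` module parse your sources and collect the top-level module names. This is a rough sketch rather than a complete tool: the helper name `top_level_imports` and the sample source are made up for illustration, and it only sees static `import` statements.

```python
import ast

def top_level_imports(source):
    """Return the sorted top-level module names imported by `source`."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                names.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # Skip relative imports (level > 0): they point back into the project.
            if node.level == 0 and node.module:
                names.add(node.module.split(".")[0])
    return sorted(names)

# Made-up sample; in practice, read each .py file of a project instead.
sample = (
    "import os.path\n"
    "from twisted.internet import reactor\n"
    "from . import helpers\n"
)
print(top_level_imports(sample))
```

Running this over a whole project and comparing the names against the two lists above gives a rough count of how many batteries the project actually uses.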