[stdlib-sig] Python-the-platform vs Python-the-framework

Orestis Markou orestis at orestis.gr
Wed Sep 16 17:35:36 CEST 2009


The recent discussions on breaking the standard library made something  
click for me. I wrote a long blog post about it, which I am pasting  
here. The original is at http://orestis.gr/blog/2009/09/16/backwards-compatibility-straw-man/

Hopefully it will help the discussion go forward, and perhaps help the  
people who are writing PEPs.

TL;DR was TL;DR: By acknowledging the different ways Python is used,
we will be in a better position to evolve it.

TL;DR version: Python can be separated into two parts:
Python-the-platform and Python-the-framework. The former is mainly the
language and some core services; the latter is the "batteries". Some
people code against frameworks that bypass the batteries, some code
against the batteries themselves. While it's important that the
batteries are there and relatively stable, they are mostly
domain-specific, so they are superseded by newer, better and more
powerful ones every once in a while. Python-core are experts on the
language, not on the various domains. We should let domain experts
develop solutions for their domains outside the stdlib, and not confuse
users by keeping outdated and cranky modules in the stdlib.


Original:

== Backwards compatibility is a straw man ==

I recently signed up to stdlib-sig so I could just nod in agreement with
the people who suggested that the stdlib needs to evolve. In the
discussions that ensued, the backwards compatibility argument came up
often. I don't think it's a valid argument for this specific discussion,
though. Here are my thoughts.

## What is stdlib?

The standard library is a set of packages and modules that get shipped  
by default with the Python interpreter. Despite what I thought when I
was starting with Python, it doesn't represent the so-called
"best-practices" modules. There is no overarching design that dictates why
things are the way they are. Instead, it just represents a set of  
modules that have been picked up at some point in time. Some of them  
are close to the language (such as collections, itertools and others)  
and some are domain-specific tools. The difference is significant.

Here is an attempt to divide the ones visible at [the docs](http://docs.python.org/library/).
Of course, I'm choosing here based on personal preference, YMMV.
Here we go:

### Language

String Services:

 > `string, re, struct, StringIO, cStringIO, codecs, unicodedata`

Data types:

 > `datetime, collections, heapq, bisect, array, sets, sched, mutex,  
queue, weakref, UserDict, UserList, UserString, types, new, copy,  
pprint, repr`

Numeric and Mathematical Modules:

 > `numbers, math, cmath, decimal, fractions, random, itertools,  
functools, operator`

(why itertools, functools and operator are here is beyond me)

File and Directory Access:

 > `os.path, stat, statvfs, filecmp, tempfile, glob, fnmatch, shutil`

Data Persistence:

 > `pickle, cPickle, copy_reg, shelve, marshal`

Data Compression and Archiving:

 > `zlib, gzip, bz2, zipfile, tarfile`

  * this category can be argued to be both domain-specific and close to
the language.

Cryptographic Services:

 > `hashlib, hmac, md5, sha`

Generic Operating System Services:

 > `os, io, time, getpass, platform, errno, ctypes`

Optional Operating System Services:

 > `select, threading, thread, dummy_threading, dummy_thread,  
multiprocessing, mmap`

Interprocess Communication and Networking:

 > `subprocess, socket, ssl, signal, popen2, asyncore, asynchat`

  * asyncore and asynchat can be said to be domain specific, but async  
io is fundamental IMO

Internet Data Handling:

 > `base64, binhex, binascii, uu`

  * these should probably live alongside codecs

Internet Protocols and Support:

 > `wsgiref, uuid`

  * wsgi is meant as an interop protocol, so I put it close to the  
language.

Internationalization:

 > `gettext, locale`

Development Tools:

 > `pydoc, doctest, unittest, 2to3, test`

Debugging and Profiling:

 > `bdb, pdb, hotshot, timeit, trace`

Python Runtime Services:

 > `sys, __builtin__, future_builtins, __main__, warnings, contextlib,  
abc, atexit, traceback, __future__, gc, inspect, site, user, fpectl`

Importing Modules:

 > `imp, imputil, zipimport, pkgutil, modulefinder, runpy`

Python Language Services:

 > `parser, ast, symtable, symbol, token, keyword, tokenize, tabnanny,  
py_compile, compileall, dis, pickletools, distutils, pyclbr, compiler`

  * a lot of these are arguably domain-specific, but given the domain
is Python...
  * the compiler package is included here as well




### Domain specific

String Services:

 > `difflib, textwrap, stringprep, fpformat`

Data types:

 > `calendar`

File and Directory Access:

 > `fileinput, linecache, dircache, macpath`

Data Persistence:

 > `anydbm, whichdb, dbm, gdbm, dbhash, bsddb, dumbdbm, sqlite3`

File Formats:

 > `csv, ConfigParser, robotparser, netrc, xdrlib, plistlib`

Generic Operating System Services:

 > `optparse, getopt, logging, curses, curses.*`

Optional Operating System Services:

 > `readline, rlcompleter`

Internet Data Handling:

 > `email, json, mailcap, mailbox, mhlib, mimetools, mimetypes,  
MimeWriter, mimify, multifile, rfc822, quopri`

Structured Markup Processing Tools:

 > `HTMLParser, sgmllib, htmllib, htmlentitydefs, xml.*`

Internet Protocols and Support:

 > `webbrowser, cgi, cgitb, urllib, urllib2, httplib, ftplib, poplib,  
imaplib, nntplib, smtplib, smtpd, telnetlib, urlparse, SocketServer,  
BaseHTTPServer, SimpleHTTPServer, CGIHTTPServer, cookielib, Cookie,  
xmlrpclib, SimpleXMLRPCServer, DocXMLRPCServer`

Multimedia Services:

 > `audioop, imageop, aifc, sunau, wave, chunk, colorsys, imghdr,  
sndhdr, ossaudiodev`

Program Frameworks:

 > `cmd, shlex`

GUI with Tk:

 > `Tkinter, Tix, ScrolledText, turtle, IDLE, Others`

Custom Python Interpreters:

 > `code, codeop`

Restricted Execution:

 > `rexec, Bastion`

  * Both have been removed from Python 3.0

Miscellaneous Services:

 > `formatter`

MS Windows Specific Services:

 > `msilib, msvcrt, _winreg, winsound`

Unix Specific Services:

 > `posix, pwd, spwd, grp, crypt, dl, termios, tty, pty, fcntl, pipes,  
posixfile, resource, nis, syslog, commands`

Mac OS X specific services:

 > `ic, MacOS, macostools, findertools, EasyDialogs, FrameWork,
autoGIL, ColorPicker`

MacPython OSA Modules:

 > `gensuitemodule, aetools, aepack, aetypes, MiniAEFrame`

SGI IRIX Specific Services:

 > `al, AL, cd, fl, FL, flp, fm, gl, DEVICE, GL, imgfile, jpeg`

SunOS Specific Services:

 > `sunaudiodev, SUNAUDIODEV`

### GRAND TOTAL

130 language-related

151 domain-specific

Damn, that is a lot of packages. For reference, PyPI currently hosts
~7500 of them. Truly, Python has a lot of batteries.

## A platform, or a framework?

I can easily see a neat split there - the first half is Python, the
platform. The second half is the batteries. However, nowadays the
batteries are not enough. While you may be able to write a quick and
dirty script with them, if you're doing web stuff you're probably
using another framework, and if you're doing desktop stuff you're
probably using another toolkit as well. Of course, there are other uses
I probably don't know anything about, and for them Python *becomes* the
framework.

I know that many frameworks built on top of Python-the-platform start  
with the batteries, and then they start writing their own  
implementations to fix bugs or add features. Django-the-framework runs  
on Python-the-platform 2.3-2.6 so it can't rely on features being  
present or bugs fixed in the batteries - it has its own.

## Backwards compatible

My issue with the backwards compatibility argument is this: No one
forces anyone to update to any version of Python. Developers make a
conscious decision to develop software for a specific version (or a
range of versions) of Python, and for specific versions of all the other
libraries they depend on. *Any* change to the dependencies of a piece
of software may lead to breakage. I see no reason why Python should be
different in that respect.
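That conscious decision can be written down in the program itself. A rough sketch, assuming a hypothetical supported range of 2.3 to 2.6 and an illustrative `is_supported` helper:

```python
import sys

# Hypothetical range: the versions this imaginary program was developed against.
SUPPORTED_MIN = (2, 3)
SUPPORTED_MAX = (2, 6)


def is_supported(version, minimum=SUPPORTED_MIN, maximum=SUPPORTED_MAX):
    """Return True if the (major, minor) part of `version` is within range."""
    return minimum <= tuple(version[:2]) <= maximum


if not is_supported(sys.version_info):
    print("warning: developed for Python %d.%d to %d.%d, running on %d.%d"
          % (SUPPORTED_MIN + SUPPORTED_MAX + tuple(sys.version_info[:2])))
```

Whether to warn or to refuse to start outright is a policy choice; the point is only that the supported range is explicit rather than assumed.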

I can't see backwards compatibility as an argument against upgrading
Python, adding features, deprecating and removing modules, and of
course fixing bugs. (Aside: Microsoft is so backwards compatible that
it will emulate bugs if important programs need them. We don't want to
do that!) Instead, I see backwards compatibility as an argument *for*
better isolation of Python-the-framework. If a program needs specific
versions of Python and libraries, it should be trivial to guard them
against change. If an operating system depends on a specific version
of Python, it should hide it away and not allow modifications.
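One way to picture "guarding against change" is a check that the installed libraries still match the versions a program was developed against. This is only a sketch: `find_pin_violations` and the pin names are hypothetical, and it assumes a modern Python where `importlib.metadata` is available (3.8 or later).

```python
from importlib import metadata


def find_pin_violations(pins, get_version=metadata.version):
    """Return {name: (wanted, found)} for every pin that is not satisfied.

    `pins` maps distribution names to the exact version expected.
    `found` is None when the distribution is not installed at all.
    """
    violations = {}
    for name, wanted in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != wanted:
            violations[name] = (wanted, found)
    return violations


# Made-up pin, purely for illustration.
print(find_pin_violations({"some-hypothetical-library": "1.0"}))
```

The `get_version` parameter exists only so the lookup can be swapped out, for example in tests.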

On the other hand, I would argue against radical changes to
Python-the-platform. Of course, this has been the case so far, with one
exception in Python 3.0 to fix issues that needed to be fixed. In fact,
there's a nice forwards-compatible feature for changes to the platform -
`__future__`. People have been upgrading to new Python features with
minor complaints, so I don't see why changing the batteries part of
the stdlib is bothering people so much.

## Best of breed

When I started with Python, I only used modules from stdlib - I had no
idea about PyPI, and I assumed that things from python-core would be
of higher quality. However, this is only true for the language
modules, not the domain-specific modules. The reason is simple -
python-core are experts on Python and language design, but not experts
on the numerous domains the batteries cover. There are now
replacements for most, if not all (OS-specific stuff probably
excluded), domain-specific modules. It is bad when people try to get a
GUI running with Tk without knowing about wx, Qt, Gtk, or the
platform-specific choices. It is just as bad when people try to do
image manipulation without knowing about PIL.

I would argue that domain-specific parts should be spun off the stdlib
and be released as separate PyPI modules. We can keep
Python-the-framework going by having a download with the kitchen sink
provided (as [Jesse Noller proposed](http://mail.python.org/pipermail/stdlib-sig/2009-September/000398.html)),
and cooperate with packagers/distributions so that they can
fortify their installations against change.

## Conclusion

The argument on stdlib-sig is huge, and thankfully it seems that
something is getting done in the end. I expect some people to agree
with me, and some to disagree. Writing my thoughts makes me think, so
please keep in mind that I am willing to be persuaded otherwise, given
the right arguments.

As far as my day-to-day use is concerned, 99% of the batteries could
disappear from my site-packages, and I would not care. Of course,
packages I actually use and import (twisted, pyobjc) _would_ care
(actually, both of those will most likely use their own batteries). I
wonder how many people find this situation familiar. Find your
imports, and see what the results are.
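If you want to try this on your own code, here is a rough sketch that counts the top-level modules a piece of source imports, using the `ast` module. The function name and the sample source are made up for illustration:

```python
import ast
from collections import Counter


def top_level_imports(source):
    """Count the top-level module names imported by a piece of source code."""
    counts = Counter()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                counts[alias.name.split(".")[0]] += 1
        elif isinstance(node, ast.ImportFrom):
            # Skip relative imports such as "from . import x".
            if node.level == 0 and node.module:
                counts[node.module.split(".")[0]] += 1
    return counts


sample = """
import os.path
import twisted.internet
from xml.dom import minidom
from . import sibling
"""
print(sorted(top_level_imports(sample)))
```

To survey a whole project, read each `.py` file, pass its contents to the function, and add the resulting counters together.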





