[stdlib-sig] Breaking out the stdlib
jnoller at gmail.com
Mon Sep 14 17:13:12 CEST 2009
Note: since I drafted this, Brett's posted some thoughts on evolution
as well: http://sayspy.blogspot.com/2009/09/evolving-standard-library.html
So, here's a small pile of thoughts - the main driving force of which
was the common sentiment shown at the language summit at last
PyCon. I'm mainly sending this out to spark some discussion.
Last PyCon, many of us agreed that the stdlib needed to be "broken
out" of the core within source control. AFAIK, this is still pending
the Mercurial migration. The goal would be to have one common
stdlib shared amongst the implementations (Jython, IronPython, etc.).
Modules which are CPython-only (such as multiprocessing) would be
kept near-core, or marked as CPython-only.
This means we would have an interpreter/builtins section for CPython,
one for Jython, etc., while they could all consume the central stdlib.
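One way a shared stdlib could mark a module as CPython-only is a simple runtime guard. This is just a sketch of the idea, not anything actually proposed - the `platform.python_implementation()` call is real, but the fallback behavior here is hypothetical:

```python
import platform

# Hypothetical guard a shared-stdlib module might use to flag a
# CPython-only dependency such as multiprocessing.
if platform.python_implementation() == "CPython":
    import multiprocessing
    workers = multiprocessing.cpu_count()
else:
    # Jython/IronPython would fall back (or raise a clear error) here.
    workers = 1

print(workers >= 1)
```

The point is that the marking lives in one shared tree, rather than each implementation carrying its own fork of the module list.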
In thinking about this even more over the past year(ish) - I've
wondered if the stdlib and python-core should actually be *really*
separated. I'm a huge fan of batteries included, but I can see the
draw to a breakdown like this:
python-core (no stdlib, interpreter, core language)
python-stdlib (no core)
python-full (the works)
(Note this may actually *help* OS package maintainers.)
Doing this - in my mind - lends itself to the stdlib evolving faster.
Sure, there are a lot of good things in the stdlib, but frankly, we have
over 216 packages in it, and only 113 developers
(http://www.python.org/dev/committers). Of those 113 developers, how
many are actually active? How many of the modules in the stdlib
actually have owners with a vested interest in maintaining them? How
many of them are evolutionary dead ends? From a packaging standpoint,
it's a lot easier to spin a new stdlib package and get it into an OS
upstream than it is to ship the entire interpreter.
The running joke is that the stdlib is where modules go to die -
personally, I don't think this should be true - although it is true to
a certain extent. It's also true that some of the modules within the
stdlib are not best-of-breed - httplib2 vs. urllib/httplib comes to
mind (mainly because I'm dealing with that right now).
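To make the best-of-breed point concrete: with the stdlib alone, even a basic request leaves things like caching - which httplib2 layered on top - entirely to the caller. A minimal sketch (using Python 3's `urllib.request` spelling purely for illustration; it builds the request but never sends it):

```python
import urllib.request

# Build (but don't send) a request using only the stdlib; headers,
# redirects, caching, etc. are all the caller's problem at this layer.
req = urllib.request.Request(
    "http://example.com/",
    headers={"User-Agent": "stdlib-demo"},
)

print(req.get_method())              # GET
print(req.get_header("User-agent"))  # stdlib-demo
```

Nothing here is wrong, exactly - it's just a lower-level abstraction than what third-party packages had moved on to.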
We all know that the stdlib has evolved over a great deal of time -
and over time the quality bar has changed (for the better) - but I'd
ask: how do we take that quality bar and beat some of the
packages/modules we have in there with it? How do we make sure that
the stdlib is stable, best of breed, and high quality?
I would say that it's entirely possible that some things simply need
to be removed; not just platform specific things, but things which
don't have maintainers on the hook to review their patches, things
which have low-to-no test coverage/docs.
I would personally like to see every single stdlib module have an
"owner" - I know, that's a long shot, but I really feel it's needed.
Otherwise you potentially have people reviewing patches for code they
may not fully understand, or whose ramifications they don't understand.
Breaking out is half the fight - the other half is cleaning it up/out:
removing things with low test coverage, poor docs, or things which
are simply *not* the best option. We need to take a really hard look
at the things in there and ask ourselves whether they meet our
collective quality bar, and if they don't: remove them. We want a good,
solid stdlib, shared amongst the implementations.