[stdlib-sig] standard metadata for the standard library (Was: Re: [Python-Dev] sharing stdlib across python implementations)
Kevin Teague
kevin at bud.ca
Thu Oct 1 12:17:40 CEST 2009
On Sep 30, 2009, at 7:28 AM, Chris Withers wrote:
> Frank Wierzbicki wrote:
>> Talk has started up again on the stdlib-sig list about finding a core
>> stdlib + tests that can be shared by all implementations, potentially
>> living apart from CPython. I have volunteered to put together a PEP
>> on the subject, with Jesse Noller and Brett Cannon helping me
>> out.
>> When I have something worth showing, I'll start the real PEP process.
>
> I'm not on stdlib-sig and I'm afraid I don't have the bandwidth to
> start on it, but I'd just like to throw in (yet again) that it would
> be great if the stdlib was actually a set of separate python
> packages with their own version metadata so that packaging tools
> could manage them, and upgrade them independently of python packages
> when there are bug fixes.
Amen!
Currently there are a number of penalties for package-savvy developers
who use packages in the standard library, since they can't use their
normal tool chains to work with the standard library. Instead, those
packages have to be treated as special cases.
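As a rough sketch of what such a special case tends to look like in a setup.py (the function name is made up for illustration, and the project is assumed to need nothing but JSON support):

```python
import sys


def build_install_requires(version_info=sys.version_info):
    """Return the dependency list for a hypothetical project that needs JSON support."""
    requires = []
    # The json module joined the standard library in Python 2.6, so only
    # older interpreters need the third-party simplejson distribution.
    if version_info < (2, 6):
        requires.append("simplejson")
    return requires


# This list would then be passed to setup(install_requires=...).
install_requires = build_install_requires()
```

The dependency is the same either way; the branch exists only because the standard library copy carries no metadata that a packaging tool could match against.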
Aside from the annoying if-else statements used to build up
install_requires fields, the lack of metadata for the standard library
poses a few other problems:
* Install tools work differently with 3rd party packages that have
been added to the standard library. Take simplejson and easy_install,
for example. easy_install is not supposed to upgrade a distribution
if it's already installed unless the -U switch is supplied. However,
do an "easy_install simplejson" (no -U switch) with Python 2.6 and the
distribution is unexpectedly upgraded. This is because the metadata
was tossed out once the distribution was incorporated into the
standard lib.
* Bug fixes are harder. If I'm working on a project which depends
upon another project, and I find a bug in that dependency, then
the preferred route to solve that problem is to contact the
dependency's author(s) and see if they'll provide a fix and do a new
release. Then I just update the project so that its install_requires
field specifies the minimum bug-free version. If it's in the standard
library though, I file a bug report, but then instead of asking for a
release of the package in question, I have to put a workaround into
the project, even if the bug has been fixed, since there is no way to
specify that I just need a fix for one particular package. That
workaround needs to stay in place in the project I was working on
until the minimum required version of Python for that project is
equal to the Python release which provides the fix. Bleh!
* What metadata does exist about the standard library is buried
in non-standard formats and isn't programmatically accessible. The
maintainers field is stored in Misc/maintainers.rst, while author and
version are stored as module attributes (__author__ and __version__).
Ideally this metadata could be collected into setup.cfg files, and
when installed would live in PEP 376 .egg-info directories, and you
would replace __version__ attributes with something such as:
    import distutils
    __version__ = distutils.get_distribution('packagename').version
> The big changes I can see from here would be moving the tests to the
> packages from the central tests directory, and adding a setup.py
> file or some other form of metadata provision for each package. Not
> that big now that I've written it ;-)
>
Yeah, this is what I was thinking. It doesn't sound big, until you
count up the number of packages in the standard library ... there are
more distributions in there than Zope 3! :P
However, if you are relying on Distutils to write out the metadata,
you run into a bootstrapping issue, where you need to use the Python
interpreter you're installing to install the standard library, but the
installation requires the standard library. Maybe there are some
clever ways to solve this, by fiddling with PATHs and installing
Distutils first or something ...
But perhaps another way to solve the problem is to not use Distutils
for installation of the working set of distributions that ships with a
given release of a Python interpreter. You only need to ensure that
the end-result is the same, and comply with the .egg-info metadata
format. It really doesn't matter if a package is installed with
Distutils or not. If the metadata consumed by setup.py files is in
setup.cfg files (or perhaps some kind of .egg-info templated format
that the standard lib setup.py files read), then those files could be
munged by some shell commands, and written out as part of the makefile
during "make install". (The only tricky bits in the new .egg-info
format are computing the full path to all installed files and
computing the MD5 hash.)
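To give a feel for those two tricky bits, here is a rough sketch in Python rather than shell. The row layout (absolute path, MD5 hex digest, size in bytes) is an assumption loosely based on the PEP 376 draft, and build_record is a made-up helper name, not part of Distutils:

```python
import csv
import hashlib
import io
import os


def build_record(install_root):
    """Return CSV text with one row per file: absolute path, MD5 hex digest, size."""
    rows = []
    for dirpath, dirnames, filenames in os.walk(install_root):
        dirnames.sort()  # visit subdirectories in a stable order
        for name in sorted(filenames):
            full_path = os.path.abspath(os.path.join(dirpath, name))
            with open(full_path, "rb") as handle:
                data = handle.read()
            rows.append((full_path, hashlib.md5(data).hexdigest(), len(data)))
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()
```

Nothing in this needs Distutils itself, which is the point: anything that can walk the installed tree and hash files could emit the same metadata during "make install".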
Speaking of which, there is one .egg-info file in the standard library
in the old-style format ... if PEP 376 is accepted then this line in
CPython's Makefile will become a bug:
    @for i in $(srcdir)/Lib/*.py $(srcdir)/Lib/*.doc $(srcdir)/Lib/*.egg-info ; \
wsgiref is the only project in the standard library with metadata,
though, so it'd be easy enough to fix this by just removing its
metadata. But if the only package with standard metadata in the
standard library had its metadata removed, it would make me sad :(