[stdlib-sig] standard metadata for the standard library (Was: Re: [Python-Dev] sharing stdlib across python implementations)

Kevin Teague kevin at bud.ca
Thu Oct 1 12:17:40 CEST 2009


On Sep 30, 2009, at 7:28 AM, Chris Withers wrote:

> Frank Wierzbicki wrote:
>> Talk has started up again on the stdlib-sig list about finding a core
>> stdlib + tests that can be shared by all implementations, potentially
>> living apart from CPython.  I have volunteered to put together a PEP
>> on the subject, with Jesse Noller and Brett Cannon helping me out.
>> When I have something worth showing, I'll start the real PEP process.
>
> I'm not on stdlib-sig and I'm afraid I don't have the bandwidth to
> start on it, but I'd just like to throw in (yet again) that it would  
> be great if the stdlib was actually a set of separate python  
> packages with their own version metadata so that packaging tools  
> could manage them, and upgrade them independently of python packages  
> when there are bug fixes.

Amen!

Currently there are a number of penalties for package-savvy developers
who use packages in the standard library, since they can't use their
normal tool chains to work with the standard library. Instead it has
to be treated as a special case.
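
To make the special-casing concrete, this is roughly what the
conditional tends to look like in a setup.py. It's only a sketch:
build_install_requires is a made-up helper name, and simplejson with
Python 2.6 is used as the example because json entered the standard
library in 2.6.

```python
import sys


def build_install_requires(python_version=None):
    """Return the dependency list for a (major, minor) Python version.

    Hypothetical helper for illustration; not part of any packaging tool.
    """
    if python_version is None:
        python_version = sys.version_info[:2]
    requires = []
    # json ships with Python 2.6+, so the third-party simplejson
    # distribution is only declared for older interpreters.
    if python_version < (2, 6):
        requires.append("simplejson")
    return requires
```

The result would be passed as install_requires to setup(). If the
standard library carried its own metadata, the version check could be
an ordinary requirement instead of a branch.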

Aside from the annoying if-else statements used to build up
install_requires fields, the lack of metadata for the standard library
poses a few other problems:

  * Install tools work differently with 3rd party packages that have
been added to the standard library. Take simplejson and easy_install,
for example. easy_install is not supposed to upgrade a distribution
if it's already installed unless the -U switch is supplied. However,
do an "easy_install simplejson" (no -U switch) with Python 2.6 and the
distribution is unexpectedly upgraded. This is because the metadata
was tossed out when the distribution was incorporated into the
standard lib.

  * Bug fixes are harder. If I'm working on a project which depends
upon another project, and I find a bug in that dependency, then the
preferred route to solve that problem is to contact the dependency's
author(s) and see if they'll provide a fix and do a new release. Then
I just update the project so that its install_requires field specifies
the minimum bug-free version. If it's in the standard library though,
I file a bug report, but then instead of asking for a release of the
package in question, I have to put a work-around into the project,
even if the bug has been fixed, since there is no way to specify that
I just need a fix for one particular package. That work-around needs
to stay in place in the project I was working on until the minimum
required version of Python for that project is equal to the Python
release which provides the fix. Bleh!

  * What metadata does exist about the standard library is buried
in non-standard formats and isn't programmatically accessible. The
maintainers field is stored in Misc/maintainers.rst, and author and
version are stored as module attributes (__author__ and __version__).
Ideally this metadata could be collected into setup.cfg files, and
when installed would live in PEP 376 .egg-info directories, and you
would replace __version__ attributes with something such as:

    import distutils
    __version__ = distutils.get_distribution('packagename').version
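
The get_distribution call above is only a suggestion of what such an
API might look like. As a rough sketch of the same lookup done by
hand, assume the installed .egg-info directory contains a PKG-INFO
file in the RFC 822-style header format that Distutils writes.
read_version is a made-up name, and the directory layout is my
assumption about how PEP 376 ends up.

```python
import os
from email.parser import Parser


def read_version(egg_info_dir):
    """Return the Version header from <egg_info_dir>/PKG-INFO.

    Hypothetical helper; assumes a PKG-INFO file with RFC 822-style
    headers lives directly inside the given directory.
    """
    pkg_info = os.path.join(egg_info_dir, "PKG-INFO")
    with open(pkg_info, encoding="utf-8") as f:
        # Only the headers matter here, so skip parsing any body.
        headers = Parser().parse(f, headersonly=True)
    return headers["Version"]
```

A missing Version header comes back as None, so a caller would still
need to decide what to do for modules that never declared one.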


> The big changes I can see from here would be moving the tests to the  
> packages from the central tests directory, and adding a setup.py  
> file or some other form of metadata provision for each package. Not
> that big now that I've written it ;-)
>


Yeah, this is what I was thinking. It doesn't sound big, until you
count up the number of packages in the standard library ... there are
more distributions in there than Zope 3! :P

However, if you are relying on Distutils to write out the metadata,
you run into a bootstrapping issue, where you need to use the Python  
interpreter you're installing to install the standard library, but the  
installation requires the standard library. Maybe there are some  
clever ways to solve this, by fiddling with PATHs and installing  
Distutils first or something ...

But perhaps another way to solve the problem is to not use Distutils  
for installation of the working set of distributions that ships with a  
given release of a Python interpreter. You only need to ensure that  
the end-result is the same, and comply with the .egg-info metadata  
format. It really doesn't matter if a package is installed with  
Distutils or not. If the metadata consumed by setup.py files is in  
setup.cfg files (or perhaps some kind of .egg-info templated format  
that the standard lib setup.py files read), then those files could be  
munged by some shell commands, and written out as part of the makefile
during "make install". (The only tricky bits in the new .egg-info
format are computing the full path to all installed files and computing
the MD5 hash.)
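
As a rough sketch of those two tricky bits, assume the record of
installed files is a CSV with one row per file holding its full path,
MD5 hash and size. That is my reading of the current PEP 376 draft and
it may change. record_rows and write_record are made-up names, and the
sketch is in Python only for readability; the point above is that the
same output could come from a makefile step.

```python
import csv
import hashlib
import os


def record_rows(installed_files):
    """Yield (full_path, md5_hex, size_in_bytes) for each installed file."""
    for path in installed_files:
        full_path = os.path.abspath(path)
        with open(full_path, "rb") as f:
            data = f.read()
        yield (full_path, hashlib.md5(data).hexdigest(), len(data))


def write_record(installed_files, out):
    """Write one CSV row per installed file to the text stream `out`."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerows(record_rows(installed_files))
```

Whatever produces the file, the end result only has to match what
Distutils would have written for the same set of files.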

Speaking of which, there is one .egg-info file in the standard library  
in the old-style format ... if PEP 376 is accepted then this line in  
CPython's Makefile will become a bug:

	@for i in $(srcdir)/Lib/*.py $(srcdir)/Lib/*.doc $(srcdir)/Lib/*.egg-info ; \

wsgiref is the only project in the standard library with metadata, so
it'd be easy enough to fix this by just removing its metadata. But if
the only package with standard metadata in the standard library had
its metadata removed, it would make me sad :(



