standard metadata for the standard library (Was: Re: [Python-Dev] sharing stdlib across python implementations)

On Sep 30, 2009, at 7:28 AM, Chris Withers wrote:
Frank Wierzbicki wrote:
Talk has started up again on the stdlib-sig list about finding a core stdlib + tests that can be shared by all implementations, potentially living apart from CPython. I have volunteered to put together a PEP on the subject, with Jesse Noller and Brett Cannon helping me out. When I have something worth showing, I'll start the real PEP process.
I'm not on stdlib-sig and I'm afraid I don't have the bandwidth to start on it, but I'd just like to throw in (yet again) that it would be great if the stdlib was actually a set of separate python packages with their own version metadata so that packaging tools could manage them, and upgrade them independently of python packages when there are bug fixes.
Amen! Currently there are a number of penalties for a package-savvy developer who uses packages in the standard library, since they can't use their normal tool chains to work with the standard library. Instead it has to be treated as a special case. Aside from the annoying if-else statements used to build up install_requires fields, the lack of metadata for the standard library poses a few other problems:

* Install tools work differently with 3rd party packages that have been added to the standard library. For example, simplejson and easy_install. easy_install is not supposed to upgrade a distribution if it's already installed unless the -U switch is supplied. However, do an "easy_install simplejson" (no -U switch) with Python 2.6 and the distribution is unexpectedly upgraded. This is because the metadata was tossed out once the distribution was incorporated into the standard lib.

* Bug fixes are harder. If I'm working on a project which depends upon another project, and I find a bug in that dependency, then the preferred route to solve that problem is to contact the dependency's author(s) and see if they'll provide a fix and do a new release. Then I just update the project so that its install_requires field specifies the minimum bug-free version. If it's in the standard library though, I file a bug report, but then instead of asking for a release of the package in question, I have to put a work-around into the project, even if the bug has been fixed, since there is no way to specify that I just need a fix for one particular package. That work-around needs to stay in place in the project I was working on until the minimum required version of Python for that project is equal to the Python release which provides the fix. Bleh!

* What metadata does exist about the standard library is buried in non-standard formats and isn't programmatically accessible. The maintainers field is stored in Misc/maintainers.rst; author and version are stored as module attributes (__author__ and __version__).

Ideally this metadata could be collected into setup.cfg files, and when installed would live in PEP 376 .egg-info directories, and you would replace __version__ attributes with something such as:

    import distutils
    __version__ = distutils.get_distribution('packagename').version
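To make the two pain points above concrete, here is a rough sketch. The helper names build_install_requires and installed_version are made up for illustration, and importlib.metadata is a module that postdates this thread (Python 3.8+); it is used here only as a stand-in for the kind of metadata lookup being wished for, not as the API proposed in PEP 376.

```python
import sys
from importlib import metadata  # Python 3.8+; postdates this thread


def build_install_requires(version_info=sys.version_info):
    """Hypothetical helper: the if-else dance a setup.py ends up doing."""
    requires = []
    # json joined the standard library in Python 2.6, so only older
    # interpreters need the separate simplejson distribution.
    if tuple(version_info[:2]) < (2, 6):
        requires.append("simplejson")
    return requires


def installed_version(dist_name):
    """Hypothetical helper: read a version from installed metadata."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # No metadata was installed for this name, so there is nothing
        # for a packaging tool to compare against.
        return None
```

With metadata available for every distribution, the first helper would not be needed at all: a project could simply state a minimum version of the package it depends on and let the install tool compare it against what the second helper reports.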
The big changes I can see from here would be moving the tests to the packages from the central tests directory, and adding a setup.py file or some other form of metadata provision for each package. Not that big now that I've written it ;-)
Yeah, this is what I was thinking. It doesn't sound big, until you count up the number of packages in the standard library ... there's more distributions in there than Zope 3! :P

However, if you are relying on Distutils to write out the metadata, you run into a bootstrapping issue, where you need to use the Python interpreter you're installing to install the standard library, but the installation requires the standard library. Maybe there are some clever ways to solve this, by fiddling with PATHs and installing Distutils first or something ...

But perhaps another way to solve the problem is to not use Distutils for installation of the working set of distributions that ships with a given release of a Python interpreter. You only need to ensure that the end result is the same, and comply with the .egg-info metadata format. It really doesn't matter if a package is installed with Distutils or not. If the metadata consumed by setup.py files is in setup.cfg files (or perhaps some kind of .egg-info templated format that the standard lib setup.py files read), then those files could be munged by some shell commands, and written out as part of the makefile during "make install". (The only tricky bits in the new .egg-info format are computing the full path to all installed files and computing the MD5 hash.)

Speaking of which, there is one .egg-info file in the standard library in the old-style format ... if PEP 376 is accepted then this line in CPython's Makefile will become a bug:

    @for i in $(srcdir)/Lib/*.py $(srcdir)/Lib/*.doc $(srcdir)/Lib/*.egg-info ; \

Although wsgiref is the only project in the standard library with metadata, so it'd be easy enough to fix this by just removing its metadata. But if the only package with standard metadata in the standard library had its metadata removed, it would make me sad :(
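For a sense of how small the "tricky bits" are, here is a minimal sketch of computing the full path and MD5 hash for a list of installed files. The function names and the (path, md5, size) CSV row layout are assumptions made for illustration, not a statement of what the PEP 376 format requires.

```python
import csv
import hashlib
import io
import os


def record_rows(paths):
    """Hypothetical helper: one (full path, MD5 hex digest, size) row per file."""
    rows = []
    for path in paths:
        full_path = os.path.abspath(path)
        with open(full_path, "rb") as handle:
            data = handle.read()
        rows.append((full_path, hashlib.md5(data).hexdigest(), len(data)))
    return rows


def format_record(rows):
    """Render the rows as CSV text (assumed layout, see the note above)."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()
```

Nothing here depends on Distutils, which is the point: the same rows could just as well be produced by shell commands during "make install".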
participants (1)
-
Kevin Teague