Author: tarek.ziade
Date: Mon Mar 29 11:03:10 2010
New Revision: 79487

Log:
reorganized the PEP so implementation details are not mixed with the proposal. Also renamed the directpry to distinfo and the metadata file to METADATA

Modified:
   peps/trunk/pep-0376.txt

Modified: peps/trunk/pep-0376.txt
==============================================================================
--- peps/trunk/pep-0376.txt	(original)
+++ peps/trunk/pep-0376.txt	Mon Mar 29 11:03:10 2010
@@ -1,5 +1,5 @@
 PEP: 376
-Title: Changing the .egg-info structure
+Title: Database of Installed Python Distributions
 Version: $Revision$
 Last-Modified: $Date$
 Author: Tarek Ziadé <tarek@ziade.org>
@@ -10,24 +10,23 @@
 Python-Version: 2.7, 3.2
 Post-History:
 
+.. contents::
 
 Abstract
 ========
 
-The overall goal of this PEP is providing an standard infrastructure to manage
-project distributions. This should allow third party tools to do installation,
-uninstallation and distribution management in a distutils compatible fashion
-and share information between them.
-
-It also provides a sample uninstall feature using this infrastructure.
-
-For it, the PEP proposes various enhancements for Distutils:
-
-- A new format to install projects, as an .egg-info structure.
-- New APIs to read a project meta-data
-- Replace PEP 262, adding capabilities to record and query information about
-  installed packages.
-- A reference uninstall feature
+The goal of this PEP is to provide a standard infrastructure to manage
+project distributions installed on a system, so all tools that are
+installing or removing projects are interoperable.
+
+To achieve this goal, the PEP proposes a new format to describe installed
+distributions on a system. It also describes a reference implementation
+for the standard library.
+
+In the past an attempt was made to create a installation database (see PEP 262
+[#pep262]_).
+
+Combined with PEP 345, the current proposal superseds PEP 262.
 
 Definitions
 ===========
@@ -64,13 +63,13 @@
 Python:
 
 - There are too many ways to do it and this makes interoperation difficult.
-- There is no API to get the metadata of installed distributions.
+- There is no API to get information on installed distributions.
 
 How distributions are installed
 -------------------------------
 
 Right now, when a distribution is installed in Python, every element it
-contains are installed in various directories.
+contains is installed in various directories.
 
 For instance, `Distutils` installs the pure Python code in the `purelib`
 directory, which is `lib\python2.6\site-packages` for unix-like systems and
@@ -99,25 +98,29 @@
 
 - a self-contained `.egg` directory, that contains all the distribution files
   and the distribution metadata in a file called `PKG-INFO` in a subdirectory
-  called `EGG-INFO`. `setuptools` creates other fils in that directory that can
+  called `EGG-INFO`. `setuptools` creates other files in that directory that can
   be considered as complementary metadata.
 
-- a `.egg-info` directory installed in `site-packages`, that contains the same
+- an `.egg-info` directory installed in `site-packages`, that contains the same
   files `EGG-INFO` has in the `.egg` format.
 
 The first format is automatically used when you install a distribution that
 uses the ``setuptools.setup`` function in its setup.py file, instead
 of the ``distutils.core.setup`` one.
 
-The `setuptools` project also provides an executable script called
+`setuptools` also add a reference to the distribution into an
+``easy-install.pth`` file.
+
+Last, the `setuptools` project provides an executable script called
 `easy_install` [#easyinstall]_ that installs all distributions, including
 distutils-based ones in self-contained `.egg` directories.
 
-If you want to have a standalone `.egg.info` directory distributions, e.g.
-the second `setuptools` format, you have to force it when you work
+If you want to have standalone `.egg-info` directories for your distributions,
+e.g. the second `setuptools` format, you have to force it when you work
 with a setuptools-based distribution or with the `easy_install` script.
 You can force it by using the `-–single-version-externally-managed` option
-**or** the `--root` option.
+**or** the `--root` option. This will make the `setuptools` project install
+the project like distutils does.
 
 This option is used by :
 
@@ -145,21 +148,24 @@
 copied in your system. And it's possible to keep track of these files for later
 removal.
 
+Moreover, the Pip project has gained an `uninstall` feature lately. It
+records all installed files, using the `record` option of the `install`
+command.
+
 What this PEP proposes
 ----------------------
 
 To address those issues, this PEP proposes a few changes:
 
-- A new `.egg-info` structure using a directory, based on one format of
+- A new `.dist-info` structure using a directory, inspired on one format of
   the `EggFormats` standard from `setuptools`.
 - New APIs in `pkgutil` to be able to query the information of installed
   distributions.
-- A de-facto replacement for PEP 262
 - An uninstall function and an uninstall script in Distutils.
 
-.egg-info becomes a directory
-=============================
+One .dist-info directory per installed distribution
+===================================================
 
 As explained earlier, the `EggFormats` standard from `setuptools` proposes two
 formats to install the metadata information of a distribution:
 
@@ -172,57 +178,34 @@
   with the metadata inside.
 
 This PEP proposes to keep just one format and make it the standard way to
-install the metadata of a distribution : a distinct `.egg-info` directory
-located in the site-packages directory, containing the metadata.
-
-This `.egg-info` directory contains a `PKG-INFO` file built by the
-`write_pkg_file` method of the `Distribution` class in Distutils.
+install the metadata of a distribution : a distinct `.dist-info` directory
+located in the site-packages directory, containing the PKG-INFO metadata
+file, renamed to METADATA, and some other files.
 
-This change does not impact Python itself because the metadata files are not
+This change will not impact Python itself because the metadata files are not
 used anywhere yet in the standard library besides Distutils.
 
-It does impact the `setuptools` and `pip` projects, but given the fact that
+It will impact the `setuptools` and `pip` projects, but given the fact that
 they already work with a directory that contains a `PKG-INFO` file, the change
 will have no deep consequences.
 
-Let's take an example of the new format with the `docutils` distribution.
-The elements installed in `site-packages` are::
+The syntax of the `dist-info` directory name is as follows::
 
-    - docutils/
-    - roman.py
-    - docutils-0.5.egg-info/
-            PKG-INFO
-
-The syntax of the egg-info directory name is as follows::
-
-    name + '-' + version + '.egg-info'
-
-The egg-info directory name is created using a new function called
-``egginfo_dirname(name, version)`` added to ``pkgutil``. ``name`` is
-converted to a standard distribution name by replacing any runs of
-non-alphanumeric characters with a single '-'. ``version`` is converted
-to a standard version string. Spaces become dots, and all other
-non-alphanumeric characters (except dots) become dashes, with runs of
-multiple dashes condensed to a single dash. Both attributes are then
-converted into their filename-escaped form, i.e. any '-' characters are
-replaced with '_' other than the one in 'egg-info' and the one
-separating the name from the version number.
+    name + '-' + version + '.dist-info'
 
-Examples::
+This `.dist-info` directory will contain these files:
 
-    >>> egginfo_dirname('docutils', '0.5')
-    'docutils-0.5.egg-info'
+- `METADATA`: the metadata, as described in PEP 345, PEP 241 and PEP 214.
+- `RECORD`: list of installed files
+- `INSTALLER`: the installer that was used
+- `REQUESTED`: a marker to now if the project was installed as a dependency
+  or not.
 
-    >>> egginfo_dirname('python-ldap', '2.5')
-    'python_ldap-2.5.egg-info'
 
-    >>> egginfo_dirname('python-ldap', '2.5 a---5')
-    'python_ldap-2.5.a_5.egg-info'
+RECORD
+------
 
-Adding a RECORD file in the .egg-info directory
-===============================================
-
-A `RECORD` file is added inside the `.egg-info` directory at installation
+A `RECORD` file is added inside the `.dist-info` directory at installation
 time when installing a source distribution using the `install` command.
 Notice that when installing a binary distribution created with `bdist` command
 or a `bdist`-based command, the `RECORD` file will be installed as well since
@@ -240,9 +223,6 @@
 
 This RECORD file is inspired from PEP 262 FILES [#pep262]_.
 
-The RECORD format
------------------
-
 The `RECORD` file is a CSV file, composed of records, one line per installed
 file. The ``csv`` module is used to read the file, with these options:
 
@@ -251,22 +231,9 @@
 - quoting char : `"`.
 - line terminator : ``os.linesep`` (so ``\r\n`` or ``\n``)
 
-Each record is composed of three elements.
+Each record is composed of three elements:
 
-- the file's full **path**
-
-  - if the installed file is located in the directory where the `.egg-info`
-    directory of the package is located, it's a '/'-separated relative
-    path, no matter what the target system is. This makes this information
-    cross-compatible and allows simple installations to be relocatable.
-
-  - if the installed file is located under ``sys.prefix`` or
-    `sys.exec_prefix``, it's a it's a '/'-separated relative path prefixed
-    by the `$PREFIX` or the `$EXEC_PREFIX` string. The `install` command
-    decides which prefix to use depending on the files. For instance if
-    it's an executable script defined in the `scripts` option of the
-    setup script, `$EXEC_PREFIX` will be used. If `install` doesn't know
-    which prefix to use, `$PREFIX` is preferred.
+- the file's full **path** (XXX wait for feedback, rephrasing)
 
 - the **MD5** hash of the file, encoded in hex. Notice that `pyc` and `pyo`
   generated files don't have any hash because they are automatically produced
@@ -284,38 +251,18 @@
 reading a file produced on a platform that uses a different new line
 terminator.
 
-Example
--------
-
-Back to our `docutils` example, we now have::
-
-    - docutils/
-    - roman.py
-    - docutils-0.5.egg-info/
-            PKG-INFO
-            RECORD
-
-And the RECORD file contains (extract)::
-
-    docutils/__init__.py,b690274f621402dda63bf11ba5373bf2,9544
-    docutils/core.py,9c4b84aff68aa55f2e9bf70481b94333,66188
-    roman.py,a4b84aff68aa55f2e9bf70481b943D3,234
-    $EXEC_PREFIX/bin/rst2html.py,a4b84aff68aa55f2e9bf70481b943D3,234
-    docutils-0.5.egg-info/PKG-INFO,6fe57de576d749536082d8e205b77748,195
-    docutils-0.5.egg-info/RECORD
+Here's an example of a RECORD file (extract)::
 
-Notice that:
+    /usr/lib/python2.6/site-packages/docutils/__init__.py,b690274f621402dda63bf11ba5373bf2,9544
+    /usr/lib/python2.6/site-packages/docutils/core.py,9c4b84aff68aa55f2e9bf70481b94333,66188
+    /usr/lib/python2.6/site-packages/roman.py,a4b84aff68aa55f2e9bf70481b943D3,234
+    /usr/local/bin/rst2html.py,a4b84aff68aa55f2e9bf70481b943D3,234
+    /usr/lib/python2.6/site-packages/docutils-0.5.dist-info/METADATA,6fe57de576d749536082d8e205b77748,195
+    /usr/lib/python2.6/site-packages/docutils-0.5.dist-info/RECORD
 
-- the `RECORD` file can't contain a hash of itself and is just mentioned here
-- `docutils` and `docutils-0.5.egg-info` are located in `site-packages` so the file
-  paths are relative to it.
+Notice that the `RECORD` file can't contain a hash of itself and is just mentioned here
 
-Example 2
----------
-
-If a project has files installed elswhere than under the Python installation
-root, they are added in the RECORD file as full paths. For example a project
-that installs a `config.ini` file in `/etc/myapp` will be added like this::
+A project that installs a `config.ini` file in `/etc/myapp` will be added like this::
 
     /etc/myapp/config.ini,b690274f621402dda63bf11ba5373bf2,9544
 
@@ -325,8 +272,8 @@
 
     c:\etc\myapp\config.ini,b690274f621402dda63bf11ba5373bf2,9544
 
-Adding an INSTALLER file in the .egg-info directory
-===================================================
+INSTALLER
+---------
 
 The `install` command has a new option called `installer`. This option is the
 name of the tool used to invoke the installation. It's an normalized
@@ -337,11 +284,12 @@
 It defaults to `distutils` if not provided.
 
 When a distribution is installed, the INSTALLER file is generated in the
-`.egg-info` directory with this value, to keep track of **who** installed the
+`.dist-info` directory with this value, to keep track of **who** installed the
 distribution. The file is a single-line text file.
 
-Adding a REQUESTED file in the .egg-info directory
-==================================================
+
+REQUESTED
+---------
 
 Some install tools automatically detect unfulfilled dependencies and
 install them. In these cases, it is useful to track which
@@ -350,7 +298,7 @@
 to the orphaned dependency.
 
 If a distribution is installed by direct user request (the usual
-case), a file REQUESTED is added to the .egg-info directory of the
+case), a file REQUESTED is added to the .dist-info directory of the
 installed distribution. The REQUESTED file may be empty, or may
 contain a marker comment line beginning with the "#" character.
 
@@ -364,31 +312,48 @@
 
 If a package that was already installed on the system as a dependency
 is later installed by name, the distutils ``install`` command will
-create the REQUESTED file in the .egg-info directory of the existing
+create the REQUESTED file in the .dist-info directory of the existing
 installation.
 
-New APIs in pkgutil
-===================
 
-To use the `.egg-info` directory content, we need to add in the standard
+Implementation details
+======================
+
+New functions and classes in pkgutil
+------------------------------------
+
+To use the `.dist-info` directory content, we need to add in the standard
 library a set of APIs. The best place to put these APIs is `pkgutil`.
 
-Query functions
----------------
+Functions
+~~~~~~~~~
+
+The new functions added in the ``pkgutil`` module are :
+
+- ``distinfo_dirname(name, version)`` -> directory name
+
+  ``name`` is converted to a standard distribution name by replacing any
+  runs of non-alphanumeric characters with a single '-'.
 
-The new functions added in the ``pkgutil`` are :
+  ``version`` is converted to a standard version string. Spaces become
+  dots, and all other non-alphanumeric characters (except dots) become
+  dashes, with runs of multiple dashes condensed to a single dash.
+
+  Both attributes are then converted into their filename-escaped form,
+  i.e. any '-' characters are replaced with '_' other than the one in
+  'dist-info' and the one separating the name from the version number.
 
 - ``get_distributions()`` -> iterator of ``Distribution`` instances.
 
-  Provides an iterator that looks for ``.egg-info`` directories in
+  Provides an iterator that looks for ``.dist-info`` directories in
   ``sys.path`` and returns ``Distribution`` instances for
   each one of them.
 
 - ``get_distribution(name)`` -> ``Distribution`` or None.
 
   Scans all elements in ``sys.path`` and looks for all directories ending with
-  ``.egg-info``. Returns a ``Distribution`` corresponding to the
-  ``.egg-info`` directory that contains a PKG-INFO that matches `name`
+  ``.dist-info``. Returns a ``Distribution`` corresponding to the
+  ``.dist-info`` directory that contains a METADATA that matches `name`
   for the `name` metadata.
 
   This function only returns the first result founded, as no more than one
@@ -400,11 +365,11 @@
   ``path`` can be a local absolute path or a relative '/'-separated path.
 
 Distribution class
-------------------
+~~~~~~~~~~~~~~~~~~
 
 A new class called ``Distribution`` is created with the path of the
-`.egg-info` directory provided to the constructor. It reads the metadata
-contained in `PKG-INFO` when it is instanciated.
+`.dist-info` directory provided to the constructor. It reads the metadata
+contained in `METADATA` when it is instanciated.
 
 ``Distribution(path)`` -> instance
 
@@ -415,7 +380,7 @@
 - ``name``: The name of the distribution.
 
 - ``metadata``: A ``DistributionMetadata`` instance loaded with the
-  distribution's PKG-INFO file.
+  distribution's METADATA file.
 
 - ``requested``: A boolean that indicates whether the REQUESTED
   metadata file is present (in other words, whether the package was
@@ -437,25 +402,25 @@
   Returns ``True`` if ``path`` is listed in `RECORD`. ``path``
   can be a local absolute path or a relative '/'-separated path.
 
-- ``get_egginfo_file(path, binary=False)`` -> file object
+- ``get_distinfo_file(path, binary=False)`` -> file object
 
-  Returns a file located under the `.egg-info` directory.
+  Returns a file located under the `.dist-info` directory.
 
   Returns a ``file`` instance for the file pointed by ``path``.
 
-  ``path`` has to be a '/'-separated path relative to the `.egg-info`
+  ``path`` has to be a '/'-separated path relative to the `.dist-info`
   directory or an absolute path.
 
-  If ``path`` is an absolute path and doesn't start with the `.egg-info`
+  If ``path`` is an absolute path and doesn't start with the `.dist-info`
   directory path, a ``DistutilsError`` is raised.
 
   If ``binary`` is ``True``, opens the file in read-only binary mode (`rb`),
   otherwise opens it in read-only mode (`r`).
 
-- ``get_egginfo_files(local=False)`` -> iterator of paths
+- ``get_distinfo_files(local=False)`` -> iterator of paths
 
   Iterates over the `RECORD` entries and returns paths for each line if the path
-  is pointing to a file located in the `.egg-info` directory or one of its
+  is pointing to a file located in the `.dist-info` directory or one of its
   subdirectories.
 
   If ``local`` is ``True``, each path is transformed into a
@@ -467,27 +432,36 @@
 more details [#pep273]_). These classes are described in the documentation
 of the prototype implementation for interested readers [#prototype]_.
 
-Usage example
--------------
+Examples
+~~~~~~~~
 
 Let's use some of the new APIs with our `docutils` example::
 
-    >>> from pkgutil import get_distribution, get_file_users
+    >>> from pkgutil import get_distribution, get_file_users, distinfo_dirname
     >>> dist = get_distribution('docutils')
     >>> dist.name
     'docutils'
     >>> dist.metadata.version
     '0.5'
 
+    >>> distinfo_dirname('docutils', '0.5')
+    'docutils-0.5.dist-info'
+
+    >>> distinfo_dirname('python-ldap', '2.5')
+    'python_ldap-2.5.dist-info'
+
+    >>> distinfo_dirname('python-ldap', '2.5 a---5')
+    'python_ldap-2.5.a_5.dist-info'
+
     >>> for path, hash, size in dist.get_installed_files()::
     ...     print '%s %s %d' % (path, hash, size)
     ...
-    docutils/__init__.py b690274f621402dda63bf11ba5373bf2 9544
-    docutils/core.py 9c4b84aff68aa55f2e9bf70481b94333 66188
-    roman.py a4b84aff68aa55f2e9bf70481b943D3 234
-    /usr/local/bin/rst2html.py a4b84aff68aa55f2e9bf70481b943D3 234
-    docutils-0.5.egg-info/PKG-INFO 6fe57de576d749536082d8e205b77748 195
-    docutils-0.5.egg-info/RECORD None None
+    /usr/lib/python2.6/site-packages/docutils/__init__.py,b690274f621402dda63bf11ba5373bf2,9544
+    /usr/lib/python2.6/site-packages/docutils/core.py,9c4b84aff68aa55f2e9bf70481b94333,66188
+    /usr/lib/python2.6/site-packages/roman.py,a4b84aff68aa55f2e9bf70481b943D3,234
+    /usr/local/bin/rst2html.py,a4b84aff68aa55f2e9bf70481b943D3,234
+    /usr/lib/python2.6/site-packages/docutils-0.5.dist-info/METADATA,6fe57de576d749536082d8e205b77748,195
+    /usr/lib/python2.6/site-packages/docutils-0.5.dist-info/RECORD
 
     >>> dist.uses('docutils/core.py')
     True
@@ -495,34 +469,15 @@
     >>> dist.uses('/usr/local/bin/rst2html.py')
     True
 
-    >>> dist.get_egginfo_file('PKG-INFO')
+    >>> dist.get_distinfo_file('METADATA')
     <open file at ...>
 
     >>> dist.requested
     True
 
-PEP 262 replacement
-===================
 
-In the past an attempt was made to create a installation database (see PEP 262
-[#pep262]_).
-
-Extract from PEP 262 Requirements:
-
-  " We need a way to figure out what distributions, and what versions of
-    those distributions, are installed on a system..."
-
-
-Since the APIs proposed in the current PEP provide everything needed to meet
-this requirement, PEP 376 replaces PEP 262 and becomes the official
-`installation database` standard.
-
-The new version of PEP 345 (XXX work in progress) extends the Metadata
-standard and fullfills the requirements described in PEP 262, like the
-`REQUIRES` section.
-
-Adding an Uninstall function
-============================
+New functions in Distutils
+--------------------------
 
 Distutils already provides a very basic way to install a distribution, which
 is running the `install` command over the `setup.py` script of the
@@ -545,7 +500,7 @@
 If the distribution is not found, a ``DistutilsUninstallError`` is be raised.
 
 Filtering
----------
+~~~~~~~~~
 
 To make it a reference API for third-party projects that wish to control
 how `uninstall` works, a second callable argument can be used. It's
@@ -570,10 +525,10 @@
 its own uninstall feature.
 
 Installer marker
-----------------
+~~~~~~~~~~~~~~~~
 
 As explained earlier in this PEP, the `install` command adds an `INSTALLER`
-file in the `.egg-info` directory with the name of the installer.
+file in the `.dist-info` directory with the name of the installer.
 
 To avoid removing distributions that where installed by another packaging system,
 the ``uninstall`` function takes an extra argument ``installer`` which default
@@ -596,7 +551,7 @@
 it has to undo at uninstallation time.
 
 Adding an Uninstall script
-==========================
+~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 An `uninstall` script is added in Distutils. and is used like this::
 
@@ -613,12 +568,17 @@
 Backward compatibility and roadmap
 ==================================
 
-These changes don't introduce any compatibility problems with the previous
-version of Distutils, and will also work with existing third-party tools.
+These changes don't introduce any compatibility problems since they
+will be implemented in:
+
+- pkgutil in new functions
+- distutils2
+
+The plan is to include the functionality outlined in this PEP in pkgutil for
+Python 2.7 and Python 3.2, and in Distutils2.
 
-The plan is to include the functionality outlined in this PEP in distutils for
-Python 2.7 and Python 3.2. A backport of the new distutils for 2.5, 2.6, 3.0
-and 3.1 is provided so people can benefit from these new features.
+Distutils2 will also contain a backport of the new pgkutil, and can be used for
+2.4 onward.
 
 Distributions installed using existing, pre-standardization formats do not have
 the necessary metadata available for the new API, and thus will be