[Distutils] PEP 426: proposed metadata caching convention

PJ Eby pje at telecommunity.com
Wed Feb 27 22:48:48 CET 2013


On Mon, Feb 25, 2013 at 9:39 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> (This probably belongs in a successor to PEP 376, but I'll leave it
> under the PEP 426 umbrella for now)
>
> One of the points raised regarding PEP 426's integrated metadata
> format is the potential for runtime issues with pkg_resources as it
> reads and processes the metadata during startup, particularly if it
> needs to process any environment markers. While I acknowledge the
> suggestions I have received that we should really be moving away from
> the current filesystem-based distributed installation information to a
> real database that properly handles import hooks, I'm looking for
> something simpler that will make it easier for setuptools and
> distribute to consume the new metadata format (and thus hopefully make
> them more amenable to generating it as well)
>
> Assuming we add an Entry-Points field as I have proposed in another
> message, I'd like to propose that installers generate three additional
> cache files as part of the installation process:
>
>     <dist-info-dir>/__cache__/version.txt
>     <dist-info-dir>/__cache__/requires-dist.txt
>     <dist-info-dir>/__cache__/entry-points.txt
>
> version.txt would just be the version of the installed distribution
> (no need to parse the main metadata file just to read the version
> field)
>
> requires-dist.txt would be similar to the pkg_resources requires.txt
> format, but use PEP 426 version specifiers. It would:
> - only contain runtime requirements where the environment markers
> match the current system
> - be split into sections based on the "extras" definition needed to
> get the environment marker to pass
>
> entry-points.txt would be the same format as the pkg_resources entry_points.txt
>
> Cheers,
> Nick.

Since this isn't going to be backwards-compatible anyway, may I suggest that:

1. The caching algorithm be fixed and defined as part of the extension machinery
2. The caching consist of simply copying the data to a file whose
name is derived programmatically from the extension/field name
3. Environment markers not be processed; that's up to the tool
consuming the cached data

This way, if e.g. entry points are defined as an extension, then the
builder making a wheel doesn't need to "understand" entry points; it
just has to copy fields to a file.  It also allows other resource types
(like i18n/l10n resources) to be defined in the metadata and cached
for runtime use, without needing a metadata version upgrade or any
tool rewrites.  And not processing environment markers means that
pure-Python wheels can still be used by just placing them on sys.path.
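
To make the three points above concrete, here is a rough sketch of what
such a "dumb copy" cache could look like.  Everything in it is an
assumption for illustration only: the function names, the rule
"lowercase the field name and append .txt", the one-value-per-line
layout, and the idea that the metadata has already been parsed into a
dict of field name -> list of raw string values.  None of this is
specified by PEP 426 or implemented in pkg_resources.

```python
import os
import tempfile


def cache_filename(field_name):
    """Derive the cache file name mechanically from the field name.

    Hypothetical rule: 'Entry-Points' -> 'entry-points.txt'.
    """
    return field_name.strip().lower() + ".txt"


def write_cache(dist_info_dir, metadata, fields):
    """Copy the raw values of the given fields into <dist-info-dir>/__cache__.

    The values are written verbatim, one per line.  Nothing is parsed or
    interpreted, so environment markers stay in the text untouched and the
    builder needs no knowledge of what any field means.
    """
    cache_dir = os.path.join(dist_info_dir, "__cache__")
    os.makedirs(cache_dir, exist_ok=True)
    written = []
    for field in fields:
        values = metadata.get(field)
        if not values:
            continue  # field absent: no cache file is created for it
        path = os.path.join(cache_dir, cache_filename(field))
        with open(path, "w", encoding="utf-8") as f:
            for value in values:
                f.write(value + "\n")
        written.append(path)
    return written


def read_cache(dist_info_dir, field_name):
    """Return the cached raw values for one field.

    Interpreting them (including any environment markers) is left to
    the consuming tool.
    """
    path = os.path.join(dist_info_dir, "__cache__", cache_filename(field_name))
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f]


if __name__ == "__main__":
    # Illustrative metadata only; the requirement string is a made-up sample.
    sample = {
        "Version": ["1.0"],
        "Requires-Dist": ["pywin32 (>1.0); sys.platform == 'win32'"],
    }
    with tempfile.TemporaryDirectory() as dist_info:
        write_cache(dist_info, sample, ["Version", "Requires-Dist"])
        print(read_cache(dist_info, "Requires-Dist"))
```

Because the marker text is copied as-is, the same cache files are valid
on every platform, which is what lets a pure-Python wheel be dropped
onto sys.path without a per-machine install step.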

