[Catalog-sig] Extending the package meta-data with more detailed download information

M.-A. Lemburg mal at egenix.com
Mon Nov 23 10:23:26 CET 2009


Tarek Ziadé wrote:
> On Thu, Nov 19, 2009 at 11:37 AM, M.-A. Lemburg <mal at egenix.com> wrote:
>> In the current or intended next version (1.2 - see PEP 345), the
>> package meta data does not include any machine usable form of
>> defining download URLs for particular platforms, Python versions
>> and variants.
>>
>> The only entry we have is:
>>
>> """
>> Download-URL
>>    A string containing the URL from which this version of the package
>>    can be downloaded. (This means that the URL can't be something like
>>    ".../package-latest.tgz", but instead must be
>>    ".../package-0.45.tgz".)
>> """
>>
>> which may be usable by a developer looking for the download links,
>> but isn't really suited for package managers to use.
>>
>> PyPI has already extended the meta-data information to include uploaded
>> files, but only makes this information available via the RPC interface.
>>
>> Now I'm not sure whether such download information should be part
>> of the package's meta-data, but do see a point in having all package
>> related information in one place for easy access by package managers
>> and developers.
>>
>> I would like to extend the available download information to make
>> automated downloads more reliable. Here's a list of things that
>> would be needed:
>>
> 
> So, if I understand correctly, you would generate automatically all
> those extra meta-data within the same "Distribution-File" field, when
> the distribution is built ?

Yes, you could have distutils generate most of those fields,
except for the URL and comment.

For one, the developer will run the setup.py several times to build
distribution files for various platforms and then upload the
generated files to some server or PyPI.

As a result, you'd have to append one "Distribution-File" entry
per generated distribution file and allow the developer to customize
the details (e.g. URL, comment, etc.).
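
As a rough sketch of what "one entry per generated file" could look
like: PKG-INFO uses RFC 822 style headers, so the same field name can
simply be repeated. The "key=value; ..." value syntax, the helper name
add_distribution_file() and the example.com URLs below are made up for
illustration; nothing in this proposal fixes them yet.

```python
from email.message import Message
from email.parser import Parser


def add_distribution_file(pkg_info, **details):
    """Append one Distribution-File header (hypothetical value syntax)."""
    # Assumption: details are serialized as "key=value" pairs joined by "; ".
    value = "; ".join("%s=%s" % (key, details[key]) for key in sorted(details))
    # Assigning to a Message appends a header; existing ones are kept.
    pkg_info["Distribution-File"] = value


pkg_info = Message()
pkg_info["Metadata-Version"] = "1.2"
pkg_info["Name"] = "mylib"
pkg_info["Version"] = "1.2.3"

# One entry per uploaded file; URL and comment are supplied by the developer.
add_distribution_file(pkg_info, type="sdist",
                      url="http://example.com/mylib-1.2.3.tar.gz")
add_distribution_file(pkg_info, type="bdist_wininst",
                      url="http://example.com/mylib-1.2.3.exe",
                      comment="Windows installer")

# Round trip: a package manager reading the file gets all entries back.
text = pkg_info.as_string()
entries = Parser().parsestr(text).get_all("Distribution-File")
```

Whether distutils or PyPI ends up writing these entries, the reading
side would look the same.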

What's not clear yet is how to get the data into the meta file.
Perhaps it shouldn't go there and instead you have PyPI generate
the extra fields in the PKG-INFO file based on what it knows
about a package and its distribution files.

> Why are they all grouped under a single field, though?

Because you can have more than just one distribution file
for a package.

>>  * Distribution type (sdist, bdist_egg, bdist_msi, bdist_wininst, etc.)
>>  * Distribution URL (full URL of the download file)
>>  * Distribution Comment (any text)
>>  * Distribution MD5 digest (as HEX string)
>>  * Distribution SHA1 digest (as HEX string)
>>  * Distribution PGP signature (as string)
>>  * Distribution variant (list defined by the package)
> 
> Nice ! Although I am not sure about "Comment". What will it provide
> that we can't provide in Summary and Description ?

It's a per-distribution-file comment and mimics what we have
on PyPI. This is a developer-edited field.

Apart from the variant and sha1 fields, the above fields are already
implemented in PyPI.
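
The two digest fields are cheap to generate at build or upload time.
A minimal sketch using hashlib (the function name file_digests() is
just an example, not something distutils provides):

```python
import hashlib


def file_digests(path, chunk_size=64 * 1024):
    """Return (md5_hex, sha1_hex) for the file at path."""
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as stream:
        # Read in chunks so large distribution files need not fit in memory.
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()
```

A package manager would run the same function on the downloaded file
and compare the result with the values from the meta-data.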

> What's "Distribution variant" ?

This optional field is meant for the developer to use on a
per-package basis.

It could be used to flag distribution files that
have certain features enabled or not (e.g. a distribution file
that includes debug, coverage code or tests vs. one that
doesn't).

Developers are free to use the variant field for their own
purposes. Package managers should provide an option to define
the variant the user intends to install.

It's an easy way to provide several different builds for the
same target platform. Without variants, the only alternative
would be creating completely new packages.

Here's an example of a C extension:

mylib-1.2.3.exe        - default variant
mylib-1.2.3-debug.exe  - 'debug' variant built with debug code enabled
mylib-1.2.3-devel.exe  - 'devel' variant with header files included
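
On the package manager side, choosing a variant could be as simple as
the sketch below. It assumes the variant is available as a separate
value per file (None for the default build) rather than being parsed
from the file name; pick_variant() is a hypothetical helper.

```python
def pick_variant(files, variant=None):
    """Return the file names built as the requested variant.

    files is a list of (file_name, variant) pairs; variant=None
    selects the default build.
    """
    return [name for name, file_variant in files if file_variant == variant]


files = [
    ("mylib-1.2.3.exe", None),
    ("mylib-1.2.3-debug.exe", "debug"),
    ("mylib-1.2.3-devel.exe", "devel"),
]

default_build = pick_variant(files)
debug_build = pick_variant(files, "debug")
```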

>>  * Python implementation (CPython, Jython, etc.)
>>  * Python version (2.5, 3.1, etc.)
>>  * Python build variant (UCS2, UCS4)
>>  * OS identifier (Windows, Linux, Mac OS X, FreeBSD, etc.)
>>  * OS version (XP, 2, 10.4, 7, etc.)
>>  * Architecture identifier (x86, x64, ppc, ppc64, sparc, sparc64, etc.)
>>  * Processor identifier (i386, i686, arm, etc.)
> 
> What are those useful for ?
> 
> We are currently adding fields such as "Requires-Python" in PEP 345,
> which allow describing the Python versions that are compatible with
> this distribution, so I am not sure what these build information
> fields will be useful for.

The above fields are meant to make it possible to match a
distribution file to a target installation. Since a developer will
often build distribution files for several platforms, you have to
include the build information for each file that you create.

Those fields are actually the main addition to what we
already have on PyPI. They make it possible for a package
manager tool to determine the right file to download
automatically in a much safer way than is currently possible
by trying to parse the distribution file's file name.
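
To illustrate the matching, here is a minimal sketch. The dictionary
keys are placeholders for the proposed fields, a field missing from an
entry is taken to mean "any", and the values returned by the platform
module (e.g. for the architecture) would still need to be normalized
to whatever identifiers the meta-data ends up using.

```python
import platform
import sys


def describe_target():
    """Describe the running installation (key names are assumptions)."""
    major, minor, _ = platform.python_version_tuple()
    return {
        "python_implementation": platform.python_implementation(),
        "python_version": "%s.%s" % (major, minor),
        "python_build": "UCS4" if sys.maxunicode > 0xFFFF else "UCS2",
        "os": platform.system(),
        "architecture": platform.machine(),
    }


def matches(entry, target):
    """True if every field the entry defines agrees with the target."""
    # A field that the entry leaves out (e.g. for an sdist) matches anything.
    return all(entry.get(key, value) == value for key, value in target.items())


def select_files(entries, target):
    """Return the URLs of all distribution files usable on the target."""
    return [entry["url"] for entry in entries if matches(entry, target)]
```

A package manager would call select_files(entries, describe_target())
and then pick among the remaining candidates, e.g. by distribution
type or variant.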

Regards,
-- 
Marc-Andre Lemburg
eGenix.com


