[Distutils] PEP 503 - Simple Repository API

M.-A. Lemburg mal at egenix.com
Mon Sep 7 23:05:15 CEST 2015


On 05.09.2015 18:12, Donald Stufft wrote:
> On September 5, 2015 at 5:43:58 AM, M.-A. Lemburg (mal at egenix.com) wrote:
>>  
>> Hmm, if the installer will build the URL itself, why is there even
>> a need for a top-level index page ?
>>  
>> I mean for the occasional human reading the page it will certainly
>> make sense to have such a page, but for the API this doesn't
>> appear to be essentially needed.
>>  
>> Or is the idea to have the package manager scan the index for packages
>> hosted on that index prior to asking for the package it would like
>> to install ?
> 
> The latest versions of pip won't use it, setuptools and older versions of pip
> will use it though. The versions of pip/setuptools that would use it, use it as
> a fallback. They don't pre-normalize the name before requesting the URL so they
> just used whatever the user typed. This comes from when a project like "Django"
> was at /simple/Django/ on PyPI but if a user typed ``pip install django`` it
> would first fetch /simple/django/ and if that 404'd it would fall back onto
> /simple/ and look for these links. On PyPI this rarely happened because PyPI
> redirects /simple/anything-that-normalizes-to-the-name/ to the correct URL but
> it's useful for static repositories that don't have anything in front of them
> to do the redirect.
> 
> I've tried to make it so that all of the SHOULD and MUST directives can be
> implemented by a standard Nginx/Apache/whatever web server with static files
> while maintaining compatibility with older installers.

Yes, understood, and that's good.
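
To make the fallback described above concrete, here is a minimal sketch
of what an older installer does, assuming the normalization rule is the
one given in the PEP (runs of "-", "_" and "." collapse to a single "-",
then lowercase). The helper names candidate_urls() and
find_project_link() are hypothetical and only serve the illustration;
they are not taken from pip or setuptools.

```python
import re


def normalize(name):
    """Collapse runs of '-', '_' and '.' into '-' and lowercase the result."""
    return re.sub(r"[-_.]+", "-", name).lower()


def candidate_urls(base, typed_name):
    """URLs an older installer would try, in order (hypothetical helper)."""
    base = base.rstrip("/")
    return [
        base + "/" + typed_name + "/",  # the name exactly as the user typed it
        base + "/",                     # fallback: the top-level index page
    ]


def find_project_link(index_links, typed_name):
    """Return the href of the index entry that normalizes to the typed name.

    index_links is a list of (link text, href) pairs taken from the
    top-level index page.
    """
    wanted = normalize(typed_name)
    for text, href in index_links:
        if normalize(text) == wanted:
            return href
    return None
```

A static repository only has to serve the top-level page for the second
step to work; no redirect logic is needed on the server side.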

Perhaps having an index page is a good thing if we want
package managers to implement search functionality.

>> Would it help the package manager to more easily detect the links
>> that point to distribution files instead of e.g. documentation or
>> other resources ?
>>  
>> setuptools uses rel="download" for this:
>>  
>> https://pythonhosted.org/setuptools/easy_install.html#package-index-api 
> 
> This is actually for the link spidering that PEP 470 removed, links marked with
> either rel="download" or rel="homepage" would be fetched (unless they looked
> installable) and searched for additional links before PEP 438/470 started to
> deprecate/remove them. Both setuptools and pip only need a simple page that has
> links that point to files on the, see for example the /simple/ page for
> requests: https://pypi.python.org/simple/requests/

Right. Perhaps I should have made the use case I'm thinking of
more obvious:

If you set up a page with links to projects and distribution files,
you will likely not leave it completely unstyled but instead integrate
it into some website which also has lots of other links to e.g.
other parts of the website, images, documentation, etc.

In such a setup, the package manager would see lots and lots of
links which are not relevant for the task. With the rel attributes,
the package manager can focus on those links which are relevant.
That's also the main reason for having those rel links in setuptools.
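
A minimal sketch of that filtering idea, using only the standard
library. Treating rel="download" as the marker for distribution files is
an assumption made for the example; the PEP does not define it, and the
names RelLinkParser and relevant_links() are hypothetical.

```python
from html.parser import HTMLParser


class RelLinkParser(HTMLParser):
    """Collect hrefs of <a> elements that carry one of the wanted rel values."""

    def __init__(self, wanted_rels=("download",)):
        super().__init__()
        self.wanted_rels = set(wanted_rels)
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        # rel is a space-separated list; a missing or valueless rel yields no tokens
        rels = set((attrs.get("rel") or "").split())
        if attrs.get("href") and rels & self.wanted_rels:
            self.links.append(attrs["href"])


def relevant_links(html, wanted_rels=("download",)):
    """Return only the links a package manager would care about."""
    parser = RelLinkParser(wanted_rels)
    parser.feed(html)
    parser.close()
    return parser.links
```

On a styled page, navigation links, images and documentation links are
simply skipped, because they do not carry the marker.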

>> Could we perhaps also add optional features like:
>>  
>> * Distribution link elements MAY include a data-gpg-sig=""
>> attribute to provide a GPG signature of the linked file
>>  
>> This could later be extended to more meta data, such as platform
>> tags, distribution file types, license info, mirror locations,
>> documentation, help strings, etc.
> 
> I actually forgot to mention the GPG signatures, currently the assumption is
> that if a GPG signature exists it will live at the same location as the file
> with a .asc on the end, so if the file is /packages/Django-1.0.tar.gz then the
> GPG signature will be located at /packages/Django-1.0.tar.gz.asc. I'll add this
> to the PEP.

Hmm, that's convention based and does not allow detecting
the presence of such signatures without actually trying a download.

I think it would be better to make the availability explicit
by adding an attribute to the link element (just like for other
such features).
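
As a rough sketch of the difference: with an explicit attribute the
installer knows from the index page alone which signature files are
worth requesting. The example assumes that the mere presence of
data-gpg-sig on a link signals an available signature and that the
signature still lives at the file URL plus ".asc"; what the attribute
value itself would contain is left open here. SigAwareParser and
signature_requests() are hypothetical names.

```python
from html.parser import HTMLParser


class SigAwareParser(HTMLParser):
    """Record each linked file and whether it advertises a GPG signature."""

    def __init__(self):
        super().__init__()
        self.files = []  # list of (href, has_signature) pairs

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        if href:
            # Assumption: presence of the attribute is the signal; its value is ignored
            self.files.append((href, "data-gpg-sig" in attrs))


def signature_requests(html):
    """Signature URLs worth fetching, without probing for 404s."""
    parser = SigAwareParser()
    parser.feed(html)
    parser.close()
    return [href + ".asc" for href, has_sig in parser.files if has_sig]
```

With the purely convention-based approach, the installer would instead
have to request a ".asc" URL for every file and accept a 404 for each
unsigned one.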

> I don't want to add more features to the API, particularly not in this PEP. My
> longer term plan is to work on a new API for installers to use which will be
> easier to work with. The current API is great for its simplicity and the fact
> it can be implemented on the server side with nothing more than a directory
> structure full of files and python -m http.server. The plan in my head is to
> add a new repository API which can handle the more complex cases AND which will
> most likely be JSON based to simplify parsing of it. The simple API would not
> be deprecated, it would just be up to the repository which "version" of the API
> they use. For people hosting their own repositories, if they have a simple case
> they will be able to get away with the simple API, but the more complex API
> would offer benefits like being able to access the metadata information without
> downloading the file.

A dynamic API is nice to have for more complex queries, but there's
nothing like a set of static files which you can just put up on
a file server or CDN to deploy. I think there's a lot more meta data
which can be put into such a static version of a repository.

The idea of trying to put meta data into file names doesn't work
(tried that, failed every time :-)). Conventions like the .asc sig
idea can go a little further, but fail badly when those conventions
result in lots of needless requests that end in 404s.

HTML is all about meta data, HTML5 even more, so why not use it for
this purpose ?

A dynamic API can be added as an addition, but is hardly ever required
for installation.

Anyway, I can understand why you would not want to make the PEP
more complex, so all this is just an attempt to push a little
in the above direction and at least make some of these things optional
standard extensions of the standard (much like we do for the DB-API
on the DB-SIG list).

-- 
Marc-Andre Lemburg
eGenix.com


