[Distutils] PEP 470 Round 2 - Using Multi Index Support for External to PyPI Package File Hosting

holger krekel holger at merlinux.eu
Sat Jun 7 18:28:47 CEST 2014


On Sat, Jun 07, 2014 at 09:46 +1000, Nick Coghlan wrote:
> On 7 Jun 2014 06:08, "Donald Stufft" <donald at stufft.io> wrote:
> >
> >
> > On Jun 6, 2014, at 9:41 AM, holger krekel <holger at merlinux.eu> wrote:
> > >
> > > Once you care for ACLs for indexes and releases you have a number
> > > of issues to consider, it's hardly related to PEP470/PEP438.
> >
> > It is related, because it means that the exact same mechanisms can be
> > used, people don’t have to learn two different ways of specifying
> > externally hosted projects. In fact it also teaches them how to specify
> > mirrors and the like as well, something that any devpi user is already
> > going to have to learn how to do.
> 
> This is the key benefit of PEP 470 from my perspective: some aspects of the
> Python packaging ecosystem suffer from a bad case of "too many ways to do
> it", and if we're ever going to fix that, we need to be ruthless in culling
> redundant concepts.
>
> Specifying custom indexes is a feature with a lot of use cases - local
> mirrors and private indexes being two of the big ones. By contrast,
> external references from the simple API duplicate a small subset of the
> custom index functionality in a way that introduces a whole slew of new
> concepts that still need to be documented and learned, even if the advice
> is "don't use that, use custom indexes instead".

Fair point from a UX design perspective -- trying to minimize the concepts
you have to learn.  However, IMO many Python users feel far from needing to
know about configuring indexes with pip.  When they try to install a
project with an external reference, they will nonetheless, with PEP 470,
need to know about indexes, the corresponding options, failure modes etc.
They will also usually depend on crawling other index sites every time they
perform an install with these options.

And I think we all agreed at one point that client-side crawling is not
the greatest thing on earth.  Linux distros have an "update" phase
collecting information from the repos, and a separate install phase.  So
you don't need to go to the remote sites to get index information at
install-time.  With pip you do it at every install.
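
To make the update/install split concrete, here is a minimal sketch of
the idea, not how pip or any distro tool is actually implemented.  The
class name, method names and the injected fetch_links callable are all
hypothetical, made up for illustration:

```python
from typing import Callable, Dict, Iterable, List


class IndexSnapshot:
    """Hypothetical local snapshot of index links (illustration only)."""

    def __init__(self, fetch_links: Callable[[str, str], List[str]]):
        # fetch_links(index_url, project) is assumed to return the release
        # links one index advertises for one project.
        self._fetch_links = fetch_links
        self._links: Dict[str, List[str]] = {}

    def update(self, index_urls: Iterable[str], projects: Iterable[str]) -> None:
        """'update' phase: the only step that contacts remote indexes."""
        index_urls = list(index_urls)
        for project in projects:
            links: List[str] = []
            for index_url in index_urls:
                links.extend(self._fetch_links(index_url, project))
            self._links[project] = links

    def candidates(self, project: str) -> List[str]:
        """'install' phase: answer purely from the local snapshot."""
        return list(self._links.get(project, []))
```

With something like this, an install would only call candidates() and
never crawl a remote site; crawling happens only when update() is run.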

And, maybe most importantly, for the integrity of their install they
will depend on the operators of this external index.  A DNS takeover, MITM
or targeted server break-in will not only compromise the server hosting
the index but also all users and companies using that index.
With a PyPI-managed checksummed release link the worst that can happen
is that the release file is not there.  We can leverage the integrity of
PyPI's usually more solid operations to keep users from getting something
malicious in the future because they decided at one point to rely on an
external index that has since turned evil.
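
As an illustration of why a checksummed link limits the damage, here is
a minimal sketch of the client-side check.  It assumes the link carries
its digest in the URL fragment in the form "#<algorithm>=<hexdigest>";
the function name and the set of accepted algorithms are hypothetical:

```python
import hashlib
from urllib.parse import urlsplit

# Assumption for this sketch: only these digest algorithms are accepted.
ALLOWED_ALGORITHMS = {"md5", "sha256", "sha512"}


def verify_release_file(link: str, content: bytes) -> bool:
    """Return True if content matches the digest in the link's fragment."""
    fragment = urlsplit(link).fragment  # e.g. "sha256=ab12..."
    algorithm, sep, expected = fragment.partition("=")
    if not sep or algorithm not in ALLOWED_ALGORITHMS:
        raise ValueError("link carries no usable checksum")
    actual = hashlib.new(algorithm, content).hexdigest()
    return actual == expected.lower()
```

If the external host serves a modified file, the digest no longer
matches the one recorded in the trusted link and the client can refuse
to install it, so the failure mode is "file unavailable" rather than
"malicious file installed".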

> As far as devpi goes, if it's only mirroring links rather than externally
> hosted files today, then in the future, it will still automatically mirror
> the external index URLs. Dependency update scanners could follow those
> links automatically, even if pip install doesn't check them by default.

Yes, but it takes work to get that right.  Simply having checksummed links
from PyPI makes things a lot simpler.

best, need to shop for a barbecue now :)
holger

> One other nice consequence of PEP 470 is that it should make it easier for
> organisations to flag and investigate cases where they're relying on an
> upstream source other than PyPI, regardless of whether they care about the
> details of their dependencies' hosting for speed, reliability or legal
> reasons.
>
> From a migration perspective, how hard would it be to automate generation
> of a custom index page on pythonhosted.org for projects currently relying
> on external references? That would still let us make the client changes
> without needing to special case PIL.
> 
> Also, it occurred to me that while the latest/any split matters for new
> users, we still need to consider the impact on projects which have pinned
> dependencies on older versions of packages that were previously externally
> hosted, but have moved to PyPI for more recent releases. I still think
> dropping the external reference feature from the simple API in favour of
> improving the custom index support is the right thing to do, but a couple of
> *client side* examples of handling the migration could help clarify the
> consequences for the existing users that may be affected.
> 
> For example, perhaps we should keep "--allow-all-external", but have it
> mean that pip automatically adds new custom index URLs given for the
> requested packages. Even if it emitted a deprecation warning, clients using
> it would keep working in the face of the proposed changes to the simple API
> link handling.
>
> Regards,
> Nick.

