[Catalog-sig] V3 PEP-draft for transitioning to pypi-hosting of release files

Carl Meyer carl at oddbird.net
Thu Mar 14 01:16:30 CET 2013

On 03/13/2013 01:33 PM, M.-A. Lemburg wrote:
> The proposal marks all external links as evil, 

I'm sorry the text of the PEP gave you that impression. I can see how
you'd have gotten it from some of the comments here on catalog-sig, but
we went to some lengths to avoid it in the PEP text, and plan to further
revise the text to try harder to avoid that implication.

In the proposed PEP, we are attempting to balance two things that I
believe to be true:

1) There are good and valid reasons for some package owners to prefer
external hosting, and it is good for automated installers to easily be
able to install such packages (on user request).

2) Installing non-PyPI-hosted packages should not be the *default*
behavior of installer tools, for many reasons, among them that it is
unusual and surprising behavior to many newcomers to the Python
ecosystem, and often leads to concerns on their part about the stability
of the ecosystem.

These are the axioms, if you will, of this proposal, and while I'd guess
many people in this discussion are at least slightly uncomfortable with
one or the other of them, I think accepting both is the most likely path
to a compromise everyone can live with.

I think we can find a solution that embraces both these axioms and
maintains good backwards-compatibility and usability. Holger and I had a
long talk this evening about that, and here are some of our thoughts:

A) You mentioned opt-in PyPI caching of externally-hosted files as a
means to improve reliability. We basically agree, but implementing this
on the PyPI side adds complexity to the PyPI implementation that we are
hesitant to propose. Rather, we propose that this is better handled by a
client-side tool that you point at a PyPI release with externally-hosted
files, and it simply copies those release files onto PyPI. This has
essentially the same effect. We envision this being a simple enough tool
that it could reasonably be run for every release of a project in an
ongoing way, not just as a one-time project-wide migration. We plan to
change the line in the PEP that says the existence of this tool is NOT
REQUIRED to begin the phase 2 transition to instead say that the
existence of this tool IS REQUIRED before the phase 2 transition begins.
(Holger already has a partial implementation of this tool.)

B) We also plan to change the PEP to say even more strongly that
installer tools should provide an easy option for installing
externally-hosted projects, and that our definition of "easy" includes
the ability for an installer to automatically tell a user what options
they can use to install a specific externally-hosted package that the
tool is refusing to install by default.

C) To make that latter part of (B) easier, we also propose that the
basic simple index include a link with a distinct rel attribute that
points to the -with-externals index page for that project (only for
packages that have external links). This way even tools using the
no-externals index by default can notify users of the existence of
external links for a project when they try to install it.
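
A minimal sketch of how an installer might use such a link to build the
hint described in (B). The rel value "with-externals", the sample page,
and the URL paths in it are all invented for illustration; the proposal
only says "a distinct rel attribute" and does not name one.

```python
from html.parser import HTMLParser

# Hypothetical rel value; the proposal does not specify the actual name.
WITH_EXTERNALS_REL = "with-externals"


class ExternalsPointerFinder(HTMLParser):
    """Record the href of the first <a> carrying the hypothetical rel value."""

    def __init__(self):
        super().__init__()
        self.externals_url = None

    def handle_starttag(self, tag, attrs):
        if tag != "a" or self.externals_url is not None:
            return
        attr_map = dict(attrs)
        rel_tokens = (attr_map.get("rel") or "").split()
        if WITH_EXTERNALS_REL in rel_tokens:
            self.externals_url = attr_map.get("href")


def externals_hint(project, index_html):
    """Return a user-facing hint, or None if no pointer link is present."""
    finder = ExternalsPointerFinder()
    finder.feed(index_html)
    if finder.externals_url is None:
        return None
    return (
        f"{project} has externally-hosted release files listed at "
        f"{finder.externals_url}; they are not installed by default."
    )


# Made-up simple-index page for an imaginary project.
SAMPLE_PAGE = """
<html><body>
<a href="../../packages/source/e/example/example-1.0.tar.gz">example-1.0.tar.gz</a>
<a rel="with-externals" href="/simple-with-externals/example/">external links</a>
</body></html>
"""

print(externals_hint("example", SAMPLE_PAGE))
```

A tool that finds no matching release on the default index could print
this hint together with whatever option it offers for enabling external
links.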

There's also another possible change, a bit more significant, that we
discussed that I'd be curious to hear your thoughts on. The initial
motivation for separating external links from the main simple/ index was
twofold: 1) Allow future tools to distinguish between internal and
external links without every tool needing to implement host-comparison
algorithms (which may break indexes that host "internal" files on a
CDN), and 2) Allow today's installers, without upgrade, to automatically
migrate eventually to no-external-installs-by-default.

Some things have caused us to re-evaluate these points:

- PyPI can automatically tag internal/external links in the simple index
with rel="internal" and rel="external", which gives future tools a more
reliable marker than host-comparison. So this takes care of #1.

- It may be that giving up #2 is acceptable in the interest of better
backward-compatibility. Old tools will still gain most of the benefits
of this PEP due to the eventual elimination of automatic link-scraping
(both from metadata and external pages) and the move to explicit
submission of external links, only for those projects that want them.
By contrast, with a separate -with-externals index, old tools would not
be able to provide a useful error message to users trying to install an
externally-hosted package that is no longer listed in the main simple/
index, which is a bad usability breakage.
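
To illustrate the first point, here is a small sketch comparing a
host-comparison check with a rel-based check. The CDN and download host
names are invented, and the (url, rel) pairs stand in for links already
parsed out of an index page; none of this is taken from the PEP text.

```python
from urllib.parse import urlparse

# Assumed index host for the example.
INDEX_HOST = "pypi.python.org"


def is_internal_by_host(file_url):
    """Host-comparison heuristic: internal only if served from the index host."""
    return urlparse(file_url).hostname == INDEX_HOST


def is_internal_by_rel(rel):
    """Marker-based check: trust the rel value the index put on the link."""
    return "internal" in (rel or "").split()


# (url, rel) pairs; the second one is an index-hosted file served via a CDN.
links = [
    ("https://pypi.python.org/packages/source/e/example/example-1.0.tar.gz", "internal"),
    ("https://cdn.example.net/packages/example-1.0.tar.gz", "internal"),
    ("https://downloads.example.org/example-1.0.tar.gz", "external"),
]

for url, rel in links:
    print(url, is_internal_by_host(url), is_internal_by_rel(rel))
```

The CDN-served file is the case where the two checks disagree: host
comparison treats it as external, while the rel marker still identifies
it as internal.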

Given that, we are thinking of perhaps simplifying the PEP to eliminate
the separate -with-externals index, and list external links in the main
simple/ index, clearly marked with rel="external". The PEP would still
recommend that future installer tools not follow rel="external" links
without specific user authorization. Old tools still get many of the
benefits, without the breakage.
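
As a rough sketch of what that recommendation could mean for a future
installer: collect the links from the simple/ page and set aside any
marked rel="external" unless the user has opted in. The allow_external
flag, the function names, and the sample page are assumptions made for
this example, not an existing tool's interface.

```python
from html.parser import HTMLParser


class RelLinkCollector(HTMLParser):
    """Collect (href, rel tokens) for every <a> tag on an index page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if href:
            rel_tokens = set((attr_map.get("rel") or "").split())
            self.links.append((href, rel_tokens))


def candidate_links(index_html, allow_external=False):
    """Split links into (allowed, skipped) under the opt-in policy.

    Only links explicitly tagged rel="external" are gated in this sketch;
    how untagged links should be treated is left open here.
    """
    collector = RelLinkCollector()
    collector.feed(index_html)
    allowed, skipped = [], []
    for href, rel_tokens in collector.links:
        if "external" in rel_tokens and not allow_external:
            skipped.append(href)
        else:
            allowed.append(href)
    return allowed, skipped


# Made-up index page with one internal and one external release file.
SAMPLE_INDEX = """
<a rel="internal" href="../../packages/source/e/example/example-1.0.tar.gz">example-1.0.tar.gz</a>
<a rel="external" href="https://downloads.example.org/example-1.1.tar.gz">example-1.1.tar.gz</a>
"""

allowed, skipped = candidate_links(SAMPLE_INDEX)
print("allowed:", allowed)
print("skipped (need user authorization):", skipped)
```

The skipped list is what a tool could report back to the user, along
with the option needed to authorize those links.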

> and instead of
> making external links more secure, the user is left with the option
> to either not enable external links at all, or to let the
> "devil" in :-)

There is no "instead of." There are parallel proposals (see the TUF
thread) to improve the security of the ecosystem, and those proposals
are not mutually exclusive with this one. If you search the PEP text,
you won't find the words "secure" or "security" anywhere within it, or
any claims of security achieved by this proposal alone.
There is a brief mention of MITM attacks, which is relevant to the PEP
because avoiding external link-crawling does reduce that attack surface,
even if other proposals will also help with that (even more).

Thanks for taking the time to read all this! Looking forward to hearing
your thoughts,

