[Catalog-sig] Deprecate External Links

PJ Eby pje at telecommunity.com
Thu Feb 28 01:08:43 CET 2013


On Wed, Feb 27, 2013 at 6:16 PM, Aaron Meurer <asmeurer at gmail.com> wrote:
> As far as I'm concerned, this is all about helping package
> maintainers.  The way pip works now, every time I do a release
> candidate, pip automatically installs it, even though I only upload it
> to Google Code.  I don't want it to do this, but the only way around
> it would be either 1. give it some weird name so that pip doesn't
> think it is newer 2. upload it somewhere else or 3. go in to PyPI and
> remove all mentions of Google Code from the index.

There's also a *fourth* way, which I asked the PyPI developers many
years ago to do, which is to stop including download links on the
/simple index for "hidden" (i.e., non-current) releases.

(Something I am still in favor of, btw.  Jim Fulton argued against it,
IIRC, and it ended in a stalemate.  However, I don't think we
discussed distinguishing PyPI downloads from other downloads, just
getting rid of old links in general.)

Frankly, just dropping /simple links for hidden releases would also
fix a good chunk of the expired-domain, stale-release, and
too-many-downloads problems.  In addition, if a project migrates to
PyPI uploads, installers will no longer crawl the external download
links of its older versions.

So, if we must do away with the links, I would suggest that the phases be:

1. Remove homepage/download URLs for "hidden" versions from the
/simple index altogether (leaving PyPI download links available)
2. Remove the rel="..." attributes from the remaining download and
home page links (this will stop off-site crawling, but not off-site
downloading)
3. Re-evaluate whether anything else actually needs to be removed.
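
To make the first two phases concrete, here is a minimal sketch of the
filtering they imply.  It is not PyPI's actual code: the Link class, its
fields, the simple_index_links() helper, and the example URLs are all
hypothetical, chosen only to illustrate which links would survive each
phase.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Link:
    """Hypothetical model of one link on a project's /simple page."""
    url: str
    hidden: bool                # True if it belongs to a "hidden" release
    external: bool              # True for homepage/download URLs off PyPI
    rel: Optional[str] = None   # "homepage" or "download" on external links


def simple_index_links(links: List[Link], phase: int) -> List[Link]:
    """Return the links that would be listed once `phase` is in effect.

    Phase 0 is the current behavior, where everything is listed.
    """
    result = []
    for link in links:
        # Phase 1: drop homepage/download URLs of hidden releases,
        # but keep files hosted on PyPI itself.
        if phase >= 1 and link.hidden and link.external:
            continue
        # Phase 2: keep the remaining external links, but strip rel="..."
        # so installers stop crawling the pages they point to.
        if phase >= 2 and link.rel is not None:
            link = replace(link, rel=None)
        result.append(link)
    return result


# Made-up example data for a project named "demo".
LINKS = [
    Link("https://pypi.example/packages/demo-1.0.tar.gz", hidden=True, external=False),
    Link("https://old.example/demo", hidden=True, external=True, rel="homepage"),
    Link("https://pypi.example/packages/demo-2.0.tar.gz", hidden=False, external=False),
    Link("https://code.example/demo/downloads", hidden=False, external=True, rel="download"),
]
```

With this data, phase 1 drops only the stale homepage link of the hidden
1.0 release, and phase 2 leaves the same three links in place but with no
rel attribute on the remaining external one.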

Basically, 99% of the complaints here are lumping together all of
these different kinds of links -- stale links, spidered links, and
plain external download links -- even though they don't create the
same sorts of problems.  Taking it in stages will give authors time to
change processes, while still getting rid of the biggest problem
sources right away (stale homepage/download URLs).

The first of these changes could be done now, though I'd check with
Jim about the buildout use case; IIRC it was to allow pinned
versions.  But if the projects in the main use cases also have eggs on
PyPI, rather than only offering them for download elsewhere, then
removing *just* the homepage/download links would clean things up
nicely, including your runaway Google Code downloads, without needing
to change any installer code that's out in the field right now.

