[Catalog-sig] Deprecation of External Urls, Statistics

PJ Eby pje at telecommunity.com
Fri Mar 8 20:54:28 CET 2013

On Fri, Mar 8, 2013 at 8:13 AM, Donald Stufft <donald at stufft.io> wrote:
> It does solve the backwards compatibility issue of killing external urls immediately so I'm not flat out against it, but there may be legal issues involved too?

I've mentioned this in the other thread as well, but the best way to
actually ensure this stuff gets moved over to PyPI is to make it
*easy*.  Give developers a button to click on PyPI that fetches all
their external links (requiring first that you give matching MD5 or
other checksums) and uploads them to PyPI, and a whole bunch of those
projects are likely to be okay with clicking it a few times.  A
command-line tool to do it (especially as a distutils/setuptools
command) would be a good idea, too.
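
To make the checksum requirement concrete, here is a rough sketch of the check such a button or command might run before copying a file over. It assumes links carry a "#md5=<hexdigest>" style fragment; the names split_checksum and matches_checksum are made up for illustration and are not part of PyPI, distutils, or setuptools. Fetching and uploading are left out on purpose.

```python
import hashlib
from urllib.parse import urldefrag


def split_checksum(link):
    """Split 'http://host/pkg-1.0.tar.gz#md5=<hex>' into (url, algorithm, digest).

    Hypothetical helper: assumes the fragment has the form '<algorithm>=<hexdigest>'.
    """
    url, fragment = urldefrag(link)
    algorithm, sep, digest = fragment.partition("=")
    if not sep or not digest:
        raise ValueError("link has no '#<algorithm>=<digest>' fragment: %r" % link)
    return url, algorithm.lower(), digest.lower()


def matches_checksum(data, algorithm, digest):
    """Return True if the bytes in `data` hash to `digest` under `algorithm`."""
    hasher = hashlib.new(algorithm)  # e.g. "md5" or "sha256"
    hasher.update(data)
    return hasher.hexdigest() == digest
```

A tool built on this would download the bytes at the returned URL, call matches_checksum on them, and only upload the file when the check passes; a link without a checksum fragment would be refused outright.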

Of the tiny minority of remaining people who object to PyPI hosting
for some reason other than convenience/familiarity (e.g. MAL's
licensing objection), it will likely be sufficient to provide an
option to add #md5 links to their description, in lieu of actual

FWIW, it's hard to get people to change behavior when one condemns
that behavior as unlikeable or socially undesirable, because it means
one is less likely to consider the other person's motivations, needs,
etc., and on top of that, the other person's resistance and rebellion
are stirred up by being the subject of one's disapproval.

So please, let's all stop talking about ways to work around the
package authors and project maintainers, or how to force them into
doing our bidding, and start talking instead about how to make it
*easy* and *obvious* for them to do what we want.

(And people who think it's already easy and obvious enough, so those
10% of projects must be stupid, will obviously not have anything
positive to contribute.)

So let me kick off that discussion with a list of known-so-far use
cases for external hosting, in descending order of my extremely rough
guesstimate of frequency:

* Always did it that way, never saw a reason to change, or didn't know
you could upload to PyPI
* Lots of files that are currently generated on the system where
they're hosted, or in an automated system that would need significant
rework to support PyPI
* Development snapshots (which may in fact be depended upon by other
in-development projects, so manual URL specification doesn't help)
* Had an issue w/PyPI availability in the past
* Objectors to PyPI's licensing requirements

Automation is aimed at the first two: make it easy enough, w/a carrot
and a stick ("external link spidering is going away, you have to put
either the links or the files on PyPI directly if you want them
found"), and a lot of people will move (assuming they're actually
still maintaining their project).

Development snapshots are an interesting case, because one of the
reasons they're valuable is that PyPI's existing multi-release
behavior is a major PITA.  You can't upload a new version of something
without PyPI creating a new release for it...  and automatically
hiding all your previous releases, including your stable release.
There's a lot that would have to be done to PyPI's release management
before it would actually be sane to track such releases there.  So the
obvious fix is to do nothing; such links being external doesn't hurt
availability for people that don't depend on them (unlike
rel=homepage/download links).

The last two issues are education/persuasion problems that won't be
affected by technology changes.

Does anybody know of any other use cases for the thousands of projects
and releases relying on external link discovery spidering?

(Disparaging remarks about why a particular use case is bad, no good,
makes you go blind, etc. need not apply: they serve only to show that
the person providing the opinion lacks sufficient empathy with the
target audience to be *useful* in a discussion of how to persuade that
target audience to behave differently.)
