[Catalog-sig] transition to pypi-hosting through server-side changes
holger krekel
holger at merlinux.eu
Sat Mar 9 08:22:22 CET 2013
Hi all,
I think Philip Eby brought up a very worthwhile idea to consider:
if we can transition to a no-external-hosting situation by making
pypi-server changes without requiring client-side installers or
release processes to change, that would be great. We would
have one place to implement things, and less friction in the probably
millions of places where pip/easy_install and CI/release processes
are used today.
Basically it all revolves around the issue of which links are
served on the simple/* pages.
What about adding a "hosting mode" field to a package which affects
all historic and future releases, i.e. the mode is not specific to a
particular release but applies to all releases? This field could have these
values and meanings:
- "pypi-only": homepage/download links are not added to simple/ pages
unless they are #egg ones. Release registration with a non-empty and
non-#egg download url is rejected. Client-side tools will not need to
crawl or download anything externally unless requiring an #egg
development tarball.
- "pypi-cache": homepage/download pages are crawled at the pypi server side
exactly once at release registration time. Or once at "transition" time
when an author chooses to have his externally hosted release files be
served from pypi.
- "pypi-linkext": homepage/download urls are crawled at the pypi server
side for release files, and the simple/ page serves links to them without
requiring client-side tools to crawl external sites for determining
the set of candidate release files. Legally, this should not pose
a problem because the files are still hosted externally, so we could
at some point automatically switch projects to this mode.
- "pypi-ext": like it is today: homepage/download urls are presented in
simple/ pages and client-side tools need to crawl them themselves to
find release file links.
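
To make the four modes concrete, here is a minimal Python sketch of how
the server could decide which links go onto a simple/ page. This is only
an illustration, not actual pypi code: the names HostingMode,
simple_page_links and accept_registration are hypothetical, the #egg
check is a plain substring test, and the handling of #egg links in
"pypi-cache" mode is an assumption since the proposal does not spell it out.

```python
from enum import Enum


class HostingMode(Enum):
    PYPI_ONLY = "pypi-only"
    PYPI_CACHE = "pypi-cache"
    PYPI_LINKEXT = "pypi-linkext"
    PYPI_EXT = "pypi-ext"


def is_egg_link(url):
    # Simplification: treat any URL carrying an "#egg" fragment as an egg link.
    return "#egg" in url


def simple_page_links(mode, pypi_files, meta_urls, crawled_files):
    """Return the links to serve on a project's simple/ page.

    pypi_files:    release files served by pypi itself (uploaded or cached)
    meta_urls:     homepage/download urls from the release metadata
    crawled_files: release-file links pypi found by crawling meta_urls
    """
    links = list(pypi_files)
    if mode is HostingMode.PYPI_ONLY:
        # Only #egg links survive; nothing else points off-site.
        links += [u for u in meta_urls if is_egg_link(u)]
    elif mode is HostingMode.PYPI_CACHE:
        # Assumption: crawled files were copied into pypi_files at
        # registration (or transition) time, so nothing external is added.
        pass
    elif mode is HostingMode.PYPI_LINKEXT:
        # Server-side crawl result: direct links to external release files.
        links += list(crawled_files)
    elif mode is HostingMode.PYPI_EXT:
        # Today's behaviour: clients crawl homepage/download urls themselves.
        links += list(meta_urls)
    return links


def accept_registration(mode, download_url):
    # "pypi-only" rejects a non-empty, non-#egg download url.
    if mode is HostingMode.PYPI_ONLY:
        return not download_url or is_egg_link(download_url)
    return True
```

The point of the sketch is that the installer side never sees the mode:
it only sees a different set of links on the same simple/ page.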
Now it is a matter of choosing good defaults and designing friendly
user interactions to allow package maintainers to move to at least "pypi-cache"
or, at best, "pypi-only" mode. My current thoughts on this:
- 90% of the projects could directly get the "pypi-only" mode as a default
according to Donald's statistics. They'd still receive a mail
with a link to a page where they can change the mode, if needed.
The mail would of course also include the friendly information that "pypi-only" provides
the fastest and most reliable way for users to install their package.
- 10% of the projects having external release files:
- if they have their newest releases on pypi already, they could get
a "linkext" mode so that client-side tools will not need to crawl
and not need to download from external sites, if they only
look for the newest release
- if they do not have their newest release on pypi, they could get "ext" mode
as default
In either case, maintainers/authors get a mail with a link to the page
where they can change the mode, along with information about the time frame
for phasing out particular modes:
- pypi-ext: in N months we automatically switch this mode to pypi-linkext
- in N+M months only "pypi-only" and "pypi-cache" are allowed.
With the latter you can still host your files externally but you need to
accept that pypi caches release files at release registration time and
serves them afterwards itself.
If you do not agree, your release files will not be automatically
discoverable anymore and you need to tell your users how to install
things manually through the description of your package.
- (and maybe: in N+M+X months only pypi-hosted is allowed as a mode)
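
The defaults and the phase-out schedule above could be sketched roughly
like this in Python. The function names are hypothetical, N and M stay
parameters because no concrete values are proposed, and the sketch
assumes that "ext" is the default for projects whose newest release is
not on pypi. The optional N+M+X step is left out.

```python
ALL_MODES = {"pypi-only", "pypi-cache", "pypi-linkext", "pypi-ext"}


def default_mode(has_external_files, newest_release_on_pypi):
    # Projects without external release files (about 90% per Donald's
    # statistics) can start out as "pypi-only".
    if not has_external_files:
        return "pypi-only"
    # Newest release already on pypi: serve crawled links server-side.
    if newest_release_on_pypi:
        return "pypi-linkext"
    # Assumption: otherwise keep today's behaviour for now.
    return "pypi-ext"


def allowed_modes(months_elapsed, n, m):
    # After N+M months only the pypi-served modes remain.
    if months_elapsed >= n + m:
        return {"pypi-only", "pypi-cache"}
    # After N months "pypi-ext" is phased out.
    if months_elapsed >= n:
        return ALL_MODES - {"pypi-ext"}
    return set(ALL_MODES)


def auto_switch(mode, months_elapsed, n):
    # At N months "pypi-ext" projects are switched to "pypi-linkext"
    # automatically; other modes are left alone in this sketch.
    if mode == "pypi-ext" and months_elapsed >= n:
        return "pypi-linkext"
    return mode
```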
I think this (or a variation/refinement of this scheme) would offer a
smooth transition where nobody needs to get upset and people would clearly
see we are doing everything we can to make it easy to transition.
cheers,
holger