[Catalog-sig] pre-PEP: transition to release-file hosting at pypi site

Donald Stufft donald at stufft.io
Tue Mar 12 21:23:14 CET 2013

On Mar 12, 2013, at 4:14 PM, Carl Meyer <carl at oddbird.net> wrote:

> On 03/12/2013 01:21 PM, PJ Eby wrote:
>>> - In some way, migrate to a situation where the popular installer tools
>>> install only release files from PyPI by default, but are capable of
>>> installing from other locations if the user provides an option.
>> Perhaps I'm confused, but ISTM that every time I've said this, Donald
>> and Lennart argue that it should not be possible to provide such an
>> option -- or to be more specific, that PyPI should not publish the
>> information that makes that option possible.
>> If that's *not* the position they're taking, it'd be good to know,
>> because we could totally stop arguing about it in that case.
> I think there's been misunderstanding on this point. Donald and Lennart
> can confirm for themselves, but I don't believe _anyone_ thinks that
> tools should not be able to install from non-PyPI sources when
> explicitly requested to do so. And IIUC from your previous message,
> you've "already agreed to change setuptools to default this option to
> only allow downloads from the same host as its index URL, in a future
> release". So I think everyone is roughly on the same page about where we
> should be headed.

I have never supported, and never will support, a proposal that removes the end user's ability to install from a non-PyPI source when requested to do so. Considering I operate a non-PyPI source, I'm not sure how this idea started.

> There is disagreement about how to make that work. My point is that I
> don't think PyPI publishing scraped-from-metadata external links on the
> simple/ index specifically, in perpetuity, is necessary or even
> beneficial to that future state.
>>> A) Leave external links in the PyPI simple index, but migrate the major
>>> tools to not use external links by default (i.e. Philip's plan to make
>>> allow-hosts=pypi the default in a future setuptools), with an option to
>>> turn them back on.
>> I don't know who has proposed this option, but it's not me.  You seem
>> to be confusing external links and HTML-scraped links (rel=""
>> attributed links in /simple).
> No, I'm not confusing those. All I'm referring to here is where you said
> you've "already agreed to change setuptools to default [allow-hosts] to
> only allow downloads from the same host as its index URL, in a future
> release." Did I not characterize that accurately?
>> I was the first person to propose disabling HTML-scraped links from
>> PyPI *ASAP*.  I still want them gone.  That won't require tool
>> changes, it just requires a rollout plan.  Holger has one, let's work
>> on that.
> Fully agreed. I understand from Holger that he would like his PEP to
> also discuss the rough plan beyond just disabling rel-link HTML
> scraping, for how to get to a point where the tools don't follow
> off-PyPI links at all by default. This second stage is what I'm talking
> about.
>> The second thing I proposed is that new tools be developed to *assist*
>> package authors in moving their files onto PyPI, so that future tool
>> changes wouldn't result in widespread instances of people needing to
>> set their tools to insecure settings just to get anything done.  We
>> need to get people's files moving onto PyPI *first*, in order to make
>> changing the tool defaults practical.
> Totally agreed that such tools could be useful, I should have included
> that point explicitly in my summary.
>> The *only* thing I object to is the part where some people want to ban
>> external links from /simple, always and forever, regardless of the
>> package authors' choice in the matter.
> I think the question of external links in /simple is causing far more
> heat than it's worth (from all sides), because it's fundamentally an
> implementation detail, not an end in itself.  Discussing the pros and
> cons of this implementation detail is more or less what the rest of this
> message is all about.
>>> B) Do a second PyPI migration, again with a per-package toggle and
>>> package owners in control, to a "no external links in simple index" setting.
>>> Consider for a moment how similar the end state here is with either A or
>>> B. In either case, by default users install only from PyPI, but by
>>> providing a special option they can install from some external source.
>>> (In B, that special option would be something like --find-links with a
>>> URL). In either case, we can continue to allow packages to register
>>> themselves on PyPI, be found in searches, etc, without uploading release
>>> files to PyPI if they prefer not to; they'll just have to provide
>>> special installation instructions to their users in that case.
>> Not true: approach B means that you won't know what values to pass to
>> the option.
> You say below that "nobody has proposed a 'trust everything' flag." If
> there is no "trust everything" flag, then it seems to me that with
> either option A or option B the user needs to specify what they intend
> to trust. I.e. if you make the default value of allow-hosts the index
> url host, as you said you plan to do at some point, users would need to
> override it with the hosts they want to allow.
> It seems like maybe what you are wanting is automatically-discoverable
> installation from externally-hosted files? I.e. that I could say
> "easy_install Foo --allow-external", without needing to know any
> specific external url for Foo?
> This is what I was characterizing as a "trust everything" flag, but on
> reflection I don't think I have any problem with that. I do think that:
> 1) external release-file URLs should be explicitly nominated by the
> package owner, not automatically sucked out of text metadata.
> 2) (After a suitable package-owner-controlled migration) those external
> links should live at a new separate (machine-readable) endpoint, not the
> existing /simple index. This has two benefits: a) even tools that exist
> today eventually gain the benefit of safer-by-default installations, and
> b) it's simpler and more reliable for future tools to distinguish
> between internal and external release file links.
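A minimal sketch of the host check discussed above, written as an illustration only and not taken from setuptools or pip. It assumes allow-hosts is a list of glob patterns matched against a link's host, and that the default is the host of the index URL. The names `INDEX_URL`, `host_allowed`, and `split_links` and the example URLs are hypothetical.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse

# Assumed index URL, used only for illustration.
INDEX_URL = "https://pypi.python.org/simple/"


def host_allowed(url, index_url=INDEX_URL, allow_hosts=None):
    """Return True if the host of `url` matches an allowed glob pattern.

    With allow_hosts=None, only the host of the index itself is allowed,
    which models the "same host as its index URL" default.
    """
    host = urlparse(url).hostname or ""
    if allow_hosts is None:
        allow_hosts = [urlparse(index_url).hostname or ""]
    return any(fnmatchcase(host, pattern) for pattern in allow_hosts)


def split_links(links, index_url=INDEX_URL):
    """Partition links into (internal, external) relative to the index host."""
    internal = [u for u in links if host_allowed(u, index_url)]
    external = [u for u in links if not host_allowed(u, index_url)]
    return internal, external


links = [
    "https://pypi.python.org/packages/source/F/Foo/Foo-1.0.tar.gz",
    "http://downloads.example.com/Foo-1.1rc1.tar.gz",
]
internal, external = split_links(links)
```

Under these assumptions the user has to name the extra hosts explicitly, for example `host_allowed(url, allow_hosts=["pypi.python.org", "*.example.com"])`, before an externally hosted file is considered at all.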
>> It's also confused about an important point.  All the links that
>> appear in /simple are *already* completely under the package author's
>> control.  No new switches are required to remove external links - you
>> can simply remove them from your releases' descriptions.  This process
>> could be made more transparent or easy, sure -- but it's a mistake to
>> say that this is granting the package owners control that they don't
>> already have.
> This is partly true. An explicit flag grants package owners more control
> in that right now they don't have a choice about whether external links
> to tarballs in their long_description automatically get sucked into the
> simple index. This is not hypothetical; even if there were no rel-link
> scraping, I've had cases where package owners have complained to me
> about pip installing an RC tarball they had linked directly from their
> long-description, not intending it to be auto-installable.
> I think it would be preferable if in the future package owners wouldn't
> need to be careful what release-file links they might place in their
> long_description, and release files would be only explicitly nominated.
> I think the current "automatically suck in links to simple/" behavior is
> only useful as a backwards-compatibility hack, which is why I think an
> explicit switch to disable it (on by default for newly-registered
> projects, slowly, gently, carefully migrated to on for existing
> projects) is better than keeping this link-scraping behavior
> indefinitely for all projects and asking package owners to clean up
> their long-descriptions.
>> What they lack control over is the rel="" attributes, short of
>> removing those links entirely.  That's why I've proposed having a
>> switch for that, as reflected in Holger's pre-PEP.
> I agree with this switch, but I think there is more benefit than cost in
> extending the concept to all automatically-sucked-in external links.
>>> 1) With B, we can provide a gentler migration for package owners, where
>>> they are in control of when the switch happens.
>>> 2) With B, all end users benefit from the new defaults, not only end
>>> users who update to the latest and greatest tools.
>>> 3) With B (and probably some forms of A as well), end users clearly
>>> state which external sources they would like to trust and install from,
>>> rather than having a global "trust everything!" flag, which is less
>>> secure and less sensible.
>> These 3 statements all mischaracterize things substantially, because
>> none of those benefits are exclusive to B, and nobody has proposed a
>> "trust everything" flag.
> You're right that item 1 is not technically exclusive to B, although I
> think B makes it much easier and simpler for package owners. "Just flip
> a switch and done" rather than "Go clean up all your package metadata
> including all past releases, or trust this tool we built to go editing
> all your release metadata for you." I'm not even sure how that
> hypothetical tool would work - what exactly would it do to automatically
> clean up a link to an external tarball that it finds in the
> long_description of a release from three years ago? Just remove it? What
> if the package owner actually wants that link there for human use?
>> Removing rel="" attributes also benefits
>> everyone right away, *without* new tools.
> Sure, and I'm fully in support of that being the first stage.
> Carl

Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
