[Distutils] PEP 470 discussion, part 3

Richard Jones r1chardj0n3s at gmail.com
Fri Jul 25 15:13:12 CEST 2014


[apologies for the terrible quoting, gmail's magic failed today]

On 24 July 2014 17:41, Donald Stufft <donald at stufft.io> wrote:
> On July 24, 2014 at 7:26:11 AM, Richard Jones (r1chardj0n3s at gmail.com) wrote:
>
> > This PEP proposes a potentially confusing break for both users and
> > packagers. In particular, during the transition there will be packages
> > which just disappear as far as users are concerned. In those cases users
> > will indeed need to learn that there is a /simple/ page and they will need
> > to view it in order to find the URL to add to their installation invocation
> > in some manner. Even once install tools start supporting the new mechanism,
> > users who lag (which as we all know are the vast majority) will run into
> > this.
>
> So we lengthen the transition time, gate it on an installer that has the
> automatic hinting becoming the dominant version. We can pretty easily see
> exactly what version of the tooling is being used to install stuff from
> PyPI.

I would like to see the PEP have detail added around this transition and
how we will avoid packages vanishing. Perhaps we could have a versioned
/simple/ to allow transition to go more smoothly with monitoring activity
on the two versions? /simple-2/? /simpler/? :)
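
Whichever way the monitoring is done, Donald's gating idea could be checked
roughly as sketched below. This is only an illustration: it assumes the
download logs can be reduced to (installer name, version) pairs, and the
`MIN_HINTING` version, the 0.9 threshold and both function names are made up
for the example. No real pip release is implied to have hinting.

```python
from collections import Counter

# Hypothetical: first pip version assumed to ship automatic hinting.
MIN_HINTING = (99, 0)


def hinting_share(requests):
    """Fraction of requests from installers at or above MIN_HINTING.

    `requests` is an iterable of (installer_name, version_tuple) pairs,
    as might be parsed from User-Agent headers.
    """
    counts = Counter(
        name == "pip" and version >= MIN_HINTING
        for name, version in requests
    )
    total = sum(counts.values())
    return counts[True] / total if total else 0.0


def safe_to_switch(requests, threshold=0.9):
    """Gate the switch-over on the share of hinting-capable installs."""
    return hinting_share(requests) >= threshold
```

The same counting could be run separately against each versioned /simple/
endpoint to see how quickly clients move from one to the other.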

Additionally, it's been pointed out to me that I've been running on
assumptions about how multi-index support works. The algorithm that must be
implemented by installer tools needs to be spelled out in the PEP.


> > Even ignoring the malicious possibility there is probably a greater
> > chance of accidental mistakes:
> >
> > - company sets up internal index using pip's multi-index support and
> >   hosts various modules
> > - someone quite innocently uploads something with the same name, newer
> >   version, to pypi
> > - company installs now use that unknown code
> >
> > devpi avoids this (I would recommend it over multi-index for companies
> > anyway) by having a white list system for packages that might be pulled
> > from upstream that would clash with internal packages.
> >
> > As Nick's mentioned, a signing infrastructure - tied to the index
> > registration of a name - could solve this problem.
>
> Yes, those are two solutions, another solution is for PyPI to allow
> registering a namespace, like dstufft.* and companies simply name all their
> packages that. This isn’t a unique problem to this PEP though. This problem
> exists anytime a company has an internal package that they do not want on
> PyPI. It’s unlikely that any of those companies are using the external link
> feature if that package is internal.

As I mentioned, using devpi solves this issue for companies hosting
internal indexes. Requiring companies to register names on a public index
to avoid collisions has been raised a few times, along the lines of "I hope
we don't have to register names on the public index to avoid this." :)
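
To make the accidental-collision scenario concrete, here is a minimal
sketch. It assumes an installer that pools candidates from every configured
index and takes the highest version, which is an assumption about
multi-index behaviour rather than anything the PEP specifies. The index
contents, the `pick` helper and the `upstream_allowed` parameter are all
hypothetical, and the allow list only loosely mirrors devpi's white list.

```python
# Hypothetical model: each index maps a project name to the versions it hosts.
INTERNAL = {"acme-utils": [(1, 0), (1, 1)]}
PUBLIC = {"acme-utils": [(9, 0)], "requests": [(2, 3)]}


def pick(name, internal, public, upstream_allowed=None):
    """Return (source, version) of the candidate an installer would choose.

    With upstream_allowed=None, candidates from both indexes are pooled and
    the highest version wins (the assumed multi-index behaviour).
    With an allow list, the public index is consulted only for names that
    are on the list or are not hosted internally.
    """
    candidates = [("internal", v) for v in internal.get(name, [])]
    use_public = (
        upstream_allowed is None
        or name in upstream_allowed
        or name not in internal
    )
    if use_public:
        candidates += [("public", v) for v in public.get(name, [])]
    if not candidates:
        raise LookupError(name)
    # Highest version wins, regardless of which index it came from.
    return max(candidates, key=lambda c: c[1])
```

With these made-up indexes, `pick("acme-utils", INTERNAL, PUBLIC)` selects
the public 9.0 upload, while passing an empty allow list
(`upstream_allowed=set()`) keeps the internal 1.1 and still lets `requests`
come from the public index.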


> > There still remains the usability issue of unsophisticated users
> > running into external indexes and needing to cope with that in one of a
> > myriad of ways as evidenced by the PEP. One solution proposed and refined
> > at the EuroPython gathering today has PyPI caching packages from external
> > indexes *for packages registered with PyPI*. That is: a requirement of
> > registering your package (and external index URL) with PyPI is that you
> > grant PyPI permission to cache packages from your index in the central
> > index - a scenario that is ideal for users. Organisations not wishing to do
> > that understand that they're the ones causing the pain for users.
>
> We can’t cache the packages which aren’t currently hosted on PyPI. Not in
> an automatic fashion anyways. We’d need to ensure that their license allows
> us to do so. The PyPI ToS ensures this when they upload but if they never
> upload then they’ve never agreed to the ToS for that artifact.

I didn't state it clearly: this would be opt-in with the project granting
PyPI permission to perform this caching. Their option is to not do so and
simply not have a listing on PyPI.

> > An extension of this proposal is quite elegant; to reduce the pain of
> > migration from the current approach to the new, we implement that caching
> > right now, using the current simple index scraping. This ensures the
> > packages are available to all clients throughout the transition period.
>
> As said above, we can’t legally do this automatically, we’d need to
> ensure that there is a license that grants us distribution rights.

A variation on the above two ideas is to just record the *link* to the
externally-hosted file from PyPI, rather than that file's content. It is
more error-prone, but avoids issues of file ownership.


      Richard