[Distutils] PEP 470, round 4 - Using Multi Repository Support for External to PyPI Package File Hosting

Donald Stufft donald at stufft.io
Tue Oct 7 14:00:41 CEST 2014


> On Oct 7, 2014, at 6:09 AM, holger krekel <holger at merlinux.eu> wrote:
> 
> On Fri, Oct 03, 2014 at 15:08 -0400, Donald Stufft wrote:
>>> On Oct 3, 2014, at 2:28 PM, holger krekel <holger at merlinux.eu> wrote:
>>> 
>>> On Sat, Oct 04, 2014 at 00:24 +1000, Nick Coghlan wrote:
>>>> On 3 October 2014 22:02, Donald Stufft <donald at stufft.io> wrote:
>>>> 
>>>>> As far as simplification goes, I don't believe it simplifies the implementation
>>>>> of PyPI at all, it just shuffles things around and creates work on my part
>>>>> in order to get PyPI supporting the new stuff. It does however let installers
>>>>> become simpler and it enables installers to present accurate error information
>>>>> that actually helps determine the root cause of a failure instead of the
>>>>> current silent failure with a confusing error message model.
>>>>> 
>>>>> I look forward to your suggestions, but I'm not hopeful. I've been thus far
>>>>> unable to determine a way to improve the current solution in a way that isn't
>>>>> just papering over one problem without solving the fundamental issue.
>>>> 
>>>> Donald's perspective here matches my own. 
>>> 
>>> I don't see the "the fundamental issue" that PEP470 tries to solve.
>>> The first para of the abstract says it wants to substitute the existing
>>> mechanism for registering external indexes with another one.  It doesn't
>>> say why.  And it doesn't say why this can't be done in a backward
>>> compatible manner which would be preferable (i hope we agree there).
>> 
>> The fundamental issue is that PyPI is really two things, an index and a
>> repository. Currently these two roles are blurred and that lack of distinction
>> causes problems for both end users and authors and those problems create a
>> certain animosity towards people not wanting to use PyPI as their repository.
>> To this end, end users should be aware when they are installing things from
>> a repository other than PyPI and they should also be aware when doing so
>> is unsafe on the wire.
>> 
>> PEP 438 solves this problem. End users opt in to using a repository other than
>> PyPI. However, it is my belief that the pain of doing so has outweighed the
>> benefits of PEP 438. 
> 
> Well, the main benefit of PEP438 was that it removed random crawling for
> some 90% of the packages on the package index, speeding up and making
> installs more reliable.  And it did that without breaking backward
> compatibility.  And I think PEP470 could achieve its goals this way too.

Sorry, I meant the main benefit with regard to projects that are hosted
externally. PEP 438 had tremendous benefit for cleaning up a ton of projects
which were not hosted externally and had links which existed for nothing more
than to slow things down and make things unsafe.

> 
>> Thus PEP 470 attempts to "go back to the drawing board"
>> and questions the mechanism for hosting on an alternative repository
>> altogether.
>> 
>>> 
>>> And because the PEP doesn't precisely say what "fundamental issue"
>>> it solves it's a bit hard to present an alternative.  If it's about
>>> focusing on "multi-repository operations" and simplifying installer UI
>>> it could be done with full backward compat:
>>> 
>>> - add PyPI maintainer UI to add external indexes along with a message
>> 
>> Ok, this is part of PEP 470 too.
>> 
>>> 
>>> - change pip to disallow crawling to an external index it finds
>>> but rather present a message that you need to add the index 
>>> manually to your installer invocation. (pip already finds external
>>> crawl URLs and it can also find the "new" ones - no need for
>>> any breakage).
>> 
>> I had thought of similar things, and my reasons for not using an <a
>> href> and instead using a meta tag and for removing the old URLs
>> instead of just making this an addition are:
>> 
>> 1. I don’t *want* users of older versions of pip/easy_install to
>> implicitly be fetching these things, they should be able to opt in as
>> well and indeed all the mechanisms exist in pip/easy_install for them
>> to already do so. The only thing that doesn’t exist is the discovery
>> mechanism.
> 
> I think it's better to generally avoid deliberately breaking things.
> Things break enough even when we don't intend them to.
> 
> IOW, Pypi should IMO aim to preserve working with as many client side
> scenarios as possible -- while adding things and improving for newer
> versions of clients.

And here, I think, is where the crux of our disagreement lies.

I think that PyPI should preserve working with as many client side
scenarios as possible, except where there is good reason not to. I
believe the fact that the vast bulk of the cases we’d be breaking are
people who are silently, and often unknowingly, being directed to
download some code over unauthenticated channels is a very good reason
to break those cases. Especially given the fact that there is a fairly
trivial workaround for people who want to restore that behavior.

In a way this is similar to switching Python to enforcing TLS verification
by default, which afaik Guido has blessed even for 2.7 assuming that there
is a sane way to restore the default behavior and configure it.
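
To make the opt-in discussed above a bit more concrete, here is a rough
sketch (not pip's actual code) of how a newer installer might discover
external repositories advertised on a project's simple page and turn them
into an actionable message instead of a silent failure. The meta tag name
"external-repository", the helper names, and the message wording are all
assumptions made for illustration. The only real piece is pip's existing
--extra-index-url option, which is one way to opt in explicitly.

```python
from html.parser import HTMLParser

# Hypothetical meta tag name; the thread only says "a meta tag".
META_NAME = "external-repository"


class ExternalRepoParser(HTMLParser):
    """Collect URLs advertised via <meta name=META_NAME content=URL>."""

    def __init__(self):
        super().__init__()
        self.repos = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        if attrs.get("name") == META_NAME and attrs.get("content"):
            self.repos.append(attrs["content"])


def discover_external_repos(simple_page_html):
    """Return the external repository URLs listed on a simple page."""
    parser = ExternalRepoParser()
    parser.feed(simple_page_html)
    parser.close()
    return parser.repos


def opt_in_hint(project, repos):
    """Build a message telling the user how to opt in explicitly."""
    if not repos:
        return "No files found for {0}.".format(project)
    flags = " ".join("--extra-index-url {0}".format(url) for url in repos)
    return (
        "No files found for {0} on PyPI. Its maintainers list external "
        "repositories; to opt in, rerun:\n"
        "  pip install {1} {0}".format(project, flags)
    )
```

Nothing is fetched from the external repository here. The installer only
reports where the files live and leaves the decision to the user.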

> 
>> 2. This doesn’t actually prevent breakage, it just links the breakage
>> to the version of pip/easy_install someone is using at the cost that
>> people with older clients are implicitly fetching things, some of
>> which may or may not be safe.
> 
> I am not sure i follow here, sorry.  There are two things the PEP does:
> 
> 1. remove "registered verified external links"
> 
> 2. support recording external indexes for a project
> 
> The first could be done without breakage except for the users and maintainers
> of that feature -- i take it we are still talking about just a few thousand
> client side uses and 60 project maintainers, right?
> 
> The second could be done without breakage altogether i think:  at one
> time all external urls are auto-registered as external indexes 
> and they are presented on the simple page with some meta information
> that does not confuse older pips/easy_installs.  Newer pips/easy_installs
> can then provide nice error messages.  Older pips can continue to use
> the PEP438 options.  And easy install can continue to work.
> 
>> Overall I think the goal of not breaking things is a good one, however PyPI
>> isn’t a versioned thing where people can limit what version of things they run.
>> It’s important just from a maintenance aspect to be able to deprecate and
>> remove things over time. This will break things for people depending on those
>> things of course, so it’s always a balancing act about deciding *when* exactly
>> to remove something. I think that this is a good time to remove this particular
>> thing because the core functionality of its replacement has existed for a long
>> time, the actual use of the feature is quite low, and leaving it in presents an
>> issue with usability and security.
> 
> I agree that removing features and functionality is a good thing.
> But i maintain PEP470 could do it without breaking things.
> 

It absolutely *could*, but as described above, I think it’s a better idea to break
things in this case.

---
Donald Stufft
PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA


