[Distutils] some questions about PEP470
holger at merlinux.eu
Sat Oct 11 08:31:48 CEST 2014
Many thanks for answering. A few follow-up questions inline.
On Thu, Oct 09, 2014 at 13:40 -0400, Donald Stufft wrote:
> > On Oct 9, 2014, at 12:41 PM, holger krekel <holger at merlinux.eu> wrote:
> > Numbers of users affected
> > ---------------------------------
> > Do i see it right that the PEP470 changes would mean about 6-7 thousand
> > users (per day) need to change their installation options to use
> > "--extra-index-url"? If not, how many? Is there a monthly figure?
> It’s impossible to couch this in terms of “users” because we have no way
> of correlating what we see on the PyPI side with users. On the single day
> I selected to look at the logs (which was more or less the day before the
> day I was compounding numbers) there were 6.6k total unique IP addresses
> that hit a /simple/ page which belonged to one of the affected projects.
> Beyond knowing how many IP addresses it’s difficult to determine how that
> correlates into users, that could be a single user with 6.6k different EC2
> machines, or it could be 6.6k individual users (or even more than that if
> there is a transparent proxy at play). In all likelihood it is not a single
> user and it is not 6.6k users but somewhere in between.
> Important to point out that this number also includes people spinning up
> bandersnatch mirrors, devpi mirrors, or any other automated fetching of
> the /simple/ page for reasons other than “I’d like to install this project”.
I understand it's hard to get to somewhat sensible numbers and the
number of unique IPs is probably only an upper bound. devpi and bandersnatch
make even that fuzzy because more than one user may be behind each such
instance. Anyway, can you provide a monthly number of unique IP addresses on
the simple pages on projects with external links?
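For illustration, a monthly figure of this kind could be tallied roughly as follows. This is only a sketch: the log line layout ("<date> <ip> <path>"), the function name and the project names are assumptions made up for the example, not the actual PyPI log format or tooling.

```python
from collections import defaultdict


def unique_ips_per_month(log_lines, affected_projects):
    """Count distinct client IPs per month that hit /simple/<project>/.

    Assumes each line looks like "<YYYY-MM-DD> <ip> <path>", which is a
    hypothetical layout chosen for this example.
    """
    ips_by_month = defaultdict(set)
    for line in log_lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip lines that do not match the assumed layout
        date, ip, path = parts
        segments = path.strip("/").split("/")
        is_simple_page = len(segments) == 2 and segments[0] == "simple"
        if is_simple_page and segments[1].lower() in affected_projects:
            ips_by_month[date[:7]].add(ip)  # key by "YYYY-MM"
    return {month: len(ips) for month, ips in ips_by_month.items()}


# Hypothetical sample data (documentation-range IPs, made-up project names).
sample = [
    "2014-10-08 192.0.2.1 /simple/exampleproj/",
    "2014-10-08 192.0.2.1 /simple/exampleproj/",
    "2014-10-09 192.0.2.2 /simple/exampleproj/",
    "2014-10-09 192.0.2.3 /simple/otherproj/",
    "2014-09-30 192.0.2.1 /simple/exampleproj/",
]
counts = unique_ips_per_month(sample, {"exampleproj"})
print(counts)
```

As discussed above, such a count says nothing about how many people sit behind each address; a mirror or proxy shows up as a single IP.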
> > And that the affected users can only do that if the respective
> > maintainers of the projects offer an external index (or re-upload to PyPI)?
> No and Yes.
> Wherever pip/easy_install are currently finding the download from can serve
> as the external index. This likely won’t be the most efficient repository
> since often times these are regular web pages which have other content and
> the like but it won’t be any worse than it is currently. For instance you
> can take a look at https://bpaste.net/show/5a83985ad2e6 to see using the
> current page as a find-links repository with pip.
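To illustrate why a regular web page can serve as a find-links style repository, here is a minimal sketch of the general idea: collect every link on the page and keep only those whose file name looks like an archive of the wanted project. It is a simplification and not pip's actual implementation; the suffix list, the function names and the example URL are assumptions.

```python
import posixpath
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Assumed set of archive suffixes, chosen for this example only.
ARCHIVE_SUFFIXES = (".tar.gz", ".tar.bz2", ".zip", ".whl")


class LinkCollector(HTMLParser):
    """Collect absolute URLs of all <a href="..."> links on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(urljoin(self.base_url, href))


def candidate_files(html, base_url, project):
    """Return links whose file name starts with "<project>-" and is an archive."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    prefix = project.lower() + "-"
    found = []
    for url in collector.links:
        filename = posixpath.basename(urlparse(url).path).lower()
        if filename.startswith(prefix) and filename.endswith(ARCHIVE_SUFFIXES):
            found.append(url)
    return found


# Hypothetical page with unrelated content mixed in.
page = """
<html><body>
<a href="/about.html">About</a>
<a href="downloads/exampleproj-1.0.tar.gz">exampleproj 1.0</a>
<a href="downloads/otherproj-2.0.zip">other project</a>
</body></html>
"""
files = candidate_files(page, "https://example.invalid/project/", "exampleproj")
print(files)
```

The unrelated links are simply ignored, which is why such a page is usable but less efficient than a dedicated index.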
How can affected users discover they need to use this particular option
and URL if they use today's pip/easy_install versions and a post PEP470 PyPI?
> > Do i see it right that up to 1000 maintainers need to act and offer an
> > external index if they want to keep their projects properly installable?
> If their project is already installable, then they already have something
> which is usable as either a simple or a find-links repository. The only
> action required on their part is if they want the discovery affordances
> in this PEP they would need to tell PyPI that.
Is this true also for the (small but still) set of maintainers who
registered external links with checksums?
If maintainers don't act, will using a post PEP470 released pip help
the users in any way?
> > I've understood you made these two statements during the discussion:
> > - PEP438 caused bad UI for dealing with pypi-external links --
> > many people are confused by it and we thus need to fix it.
> > - PEP470 breaking backward compatibility for pypi-external links is
> > not a big deal because it affects only a tiny fraction of the users.
> > Could you choose which one of them you consider is true?
> I consider them both to be true.
> The PEP 438 UX is confusing; out of the people who have had to use it I
> have seen a fairly high percentage of those completely confused by it. It,
> especially right when pip 1.5 was released, was one of our most reported
> issues. The total number of people who need to use it has gone down over
> time, however I still believe that percentage wise most people who need to
> use it are confused by it.
> I do not believe PEP 470 breaking backwards compatibility for pypi-external
> links to be a terrible burden because it only affects a small percentage of the
> total users of PyPI.
> I think perhaps the reason you think both of them can’t be true is you’re
> assuming that I’m talking about percentages of the same total population?
Yes, i was assuming that for both statements the same basis group was used.
So i understand now that you are saying overall very few people depend on
external links, but out of those who do, many are confused and annoyed
about how it works.
Will the people who suffered from the current external linking options
be the same ones who could be affected by backward compatibility issues
(i.e. commands which now work can fail with a post-PEP470 PyPI server)?
Personal side question: do i remember correctly that when we discussed
PEP438 you pushed for the current set of behaviours wrt external
links, because you put higher priority on protection against MITM
attacks, while i tried to keep it simpler?
> > Recommendation of "--extra-index-url"
> > --------------------------------------
> > In your mind and forgetting about PEP470, in what situations exactly is
> > "pip install --extra-index-url" a safe option for users?
> The answer to this isn’t really related to --extra-index-url, ``pip install foo``
> is “safe” (given the threat model we operate under) if, and only if, you trust
> the operators of all of the repositories you have configured (by default, via
> --index-url, via --extra-index-url, via --find-links, and via --process-dependency-links),
> to give you the correct files for “foo”. How the repositories have come to be
> configured isn’t particularly meaningful.
I understand that as a fairly generic security statement. But I was rather
trying to ask about use cases and scenarios where precisely the
--extra-index-url option is useful and to be recommended.
I'd be grateful if Nick or you could still describe use cases,
especially outside PEP470 external links context (the option existed
before so i presume there must be some use cases).
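The generic trust statement quoted above can be read as a check over every configured repository, regardless of which option introduced it. A minimal sketch of that reading follows; the dictionary layout, the helper name and the example hosts are assumptions for illustration and do not reflect how pip stores its configuration.

```python
from urllib.parse import urlparse


def untrusted_sources(configured_urls, trusted_hosts):
    """Return the configured repository URLs whose host is not trusted."""
    return [
        url for url in configured_urls
        if urlparse(url).hostname not in trusted_hosts
    ]


# Hypothetical configuration, keyed by the option that added each URL.
configured = {
    "index-url": ["https://pypi.python.org/simple/"],
    "extra-index-url": ["https://index.example.invalid/simple/"],
    "find-links": ["https://downloads.example.invalid/files/"],
}

# The option a URL came from does not matter for the check; only the
# combined list of repositories does.
all_urls = [url for urls in configured.values() for url in urls]
missing = untrusted_sources(
    all_urls, {"pypi.python.org", "downloads.example.invalid"}
)
print(missing)
```

Under this reading an install is only "safe" when the returned list is empty, which is why the question of when to recommend --extra-index-url comes down to whether users can judge the operator of the extra index.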
> > Interpretation of external link usage
> > --------------------------------------------
> > In the main rationale you say:
> > "While a large number of projects did ultimately decide to upload to
> > PyPI, some of them did so only because the UX around what PEP 438 was so
> > bad that they felt forced to do so."
> > Could you provide some tractable background (not just your strong opinion)
> > for this interpretation? Why can it not be that people nowadays just
> > prefer to upload to PyPI without even considering alternative options?
> Well Stefan had voiced that complaint last time that he felt we were trying
> to force him to upload to PyPI by making the UX so bad. I’ve had a few other
> people say similar things to me in private.
I can sympathize. In fact, I think we didn't deliver the upload tools
that we outlined with PEP438, particularly registration of externally
verified links. My bad as well.