[Distutils] PEP470 installation security problems

Donald Stufft donald at stufft.io
Wed Oct 8 09:47:58 CEST 2014

> On Oct 8, 2014, at 3:17 AM, holger krekel <holger at merlinux.eu> wrote:
> On Tue, Oct 07, 2014 at 08:00 -0400, Donald Stufft wrote:
>>> On Oct 7, 2014, at 6:09 AM, holger krekel <holger at merlinux.eu> wrote:
>>>> I had thought of similar things, and my reasons for not using an <a
>>>> href> and instead using a meta tag and for removing the old URLs
>>>> instead of just making this in addition to is:
>>>> 1. I don’t *want* users of older versions of pip/easy_install to
>>>> implicitly be fetching these things, they should be able to opt in as
>>>> well and indeed all the mechanisms exist in pip/easy_install for them
>>>> to already do so. The only thing that doesn’t exist is the discovery
>>>> mechanism.
>>> I think it's better to generally avoid deliberately breaking things.
>>> Things break enough even when we don't intend them to.
>>> IOW, Pypi should IMO aim to preserve working with as many client side
>>> scenarios as possible -- while adding things and improving for newer
>>> versions of clients.
>> And here, I think, is where the crux of our disagreement lies.
>> I think that PyPI should preserve working with as many client side
>> scenarios as possible, except where there is good reason not to.
>> I believe the fact that the vast bulk of the cases we’d be breaking are
>> people who are silently, and often unknowingly, being directed to
>> download some code over unauthenticated channels is a very good reason
>> to break those cases. Especially given the fact that there is a fairly
>> trivial work around for people who want to restore that behavior.
>> In a way this is similar to switching Python to enforcing TLS verification
>> by default, which afaik Guido has blessed even for 2.7 assuming that there
>> is a sane way to restore the default behavior and configure it.
> Are you saying that PEP470's breaking of backward compatibility is
> deliberate and helps to defend against MITM attacks during installation?
> That might be true although i note that hacked servers (see also: bash,
> ssl) are much more common than MITM attacks and a hacked server can do
> SSL just fine.


> In any case, I see two security related downsides of PEP470, one of
> them severe.
> For one, current multi-index operations are riskier than PEP438's
> validated external release file urls.  Because currently you only need
> to trust pypi.python.org has not been hacked but with PEP470 you
> need to trust the integrity of the external site as well.
> IIUC you and Nick think this is acceptable because people deliberately
> make that choice by supplying an explicit option to use the external
> index, right?  If so i think the PEP should also be clear on the fact that
> Pip/pypi's external repo support is far inferior to typical linux repos
> because release files are not signed etc.

I might agree with you, except an important consideration of a security
feature is, “Does anybody even use this?”. Looking at adoption rates it’s
clear that practically nobody *does* use it. If it’s the most secure thing
in the world but 95+% of the traffic is using the insecure option, does it
really even matter if it’s secure?

To be honest, it’s *not* inferior to typical Linux repos because in both
cases there is an online key you can compromise. If you compromise the
Debian build fleet you can sign any release files you want, just like if
you compromise the Fastly servers and get the PyPI TLS key. You generally
do *not* get end to end verification on any Linux repo. The big benefit
of the Linux model is that it enables untrusted mirrors whereas our
current model does not.

> Worse security problems loom with current multi-index ops like
> the --extra-index-url option which is advertised prominently in PEP470.
> You recommend to use it for private package indexes, but it can
> trivially compromise user machines: you register a private package name
> publicly on PyPI and add some malware release files, and can then
> infect all machines which execute an innocent "pip install
> --extra-index-url ...".  I think we conversed about this issue earlier but i
> don't see the PEP discussing it but rather it recommends using it
> without a direct call for caution (*).  I maintain this attack is more
> serious than MITM attacks for which you are even ready to break backward
> compat.

In the context of PEP 470 it’s giving another way for someone who has
registered a project on PyPI to host off of PyPI. In this sense there is
zero ability for someone else to come along and “override” the package name.
The ability to do this for private projects is really only relevant in that
by reusing that mechanism we have a single concept that users need to learn
instead of multiple concepts. “There should be one way to do it”.

> Donald, Nick, i am not against the goals of PEP470 per se but in its
> current form i see it rather causing damage.  When i explained to companies
> the dangers of pip multi-index operations they were rather alarmed and urged
> me to do something about it within the devpi context.  But PEP470 pretends
> all is fine and everybody should move to multi-index immediately -- that's
> premature at least if not outright endangering users even today because
> they take the advice in the draft PEP470 for granted because it comes
> from Nick and Donald who usually know what they are talking about.

This is really FUDish. Multi repository support *is* fine. If you have a private
project then you should likely claim the name on PyPI because even without
multi repository support all it would take is someone running pip on their
machine and forgetting to switch to your internal index to attack you too.

Can there be more improvements? Absolutely. However this particular problem
is an inherent issue with a central repository that anyone can upload to.
There are things we can do to make it less of a problem but it’s impossible
to ever completely solve it.
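
As a rough illustration of the name collision being discussed, here is a
minimal sketch. It assumes an installer that pools candidates from every
configured index and simply takes the highest version; that is a
simplified model, not pip's actual code, and the index contents and the
`acme-internal` project name are made up for the example.

```python
# Simplified model only; this is NOT pip's real resolution code.
# Index contents and the "acme-internal" name are hypothetical.

def parse_version(text):
    """Turn a dotted version string such as '1.2' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def pick_candidate(name, indexes):
    """Pool candidates for `name` from every index and return the
    (version, index_name) pair with the highest version, or None."""
    candidates = [
        (parse_version(version), index_name)
        for index_name, packages in indexes.items()
        for version in packages.get(name, [])
    ]
    if not candidates:
        return None
    version, source = max(candidates)
    return ".".join(str(part) for part in version), source


indexes = {
    # Anyone can register a name on the public index.
    "public": {"requests": ["2.4.1"], "acme-internal": ["99.0"]},
    # The private index holds the legitimate internal release.
    "private": {"acme-internal": ["1.2"]},
}

# The higher version on the public index wins over the private one.
result = pick_candidate("acme-internal", indexes)
```

Under this model the outcome depends only on who publishes the higher
version number, which is why claiming the name on PyPI closes the gap.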

> At the very least we need to have clear discussion in the PEP about it
> and safer options for pip and PEP470 needs to MANDATE it for pip and
> maybe even for easy_install -- you could follow the devpi
> "pypi_whitelist" design to prevent mixed private/public package links
> and introduce a "--private-index-url" which means that pip would look
> first there and when it finds links for a name it would not consider
> other/public indexes unless the name is explicitly whitelisted.  I
> admit i am not happy about the usability of that but it gives a good
> secure default against public packages infecting private package installs.
> best,
> holger
> (*) I saw that PEP470 in a different section says "Installers SHOULD
> implement some mechanism for removing or otherwise disabling use of the
> default repository." but that's just a "SHOULD" and even if implemented
> it will not fix things retroactively for older pip/easy_install
> users -- but you claim fixing things for them is within the PEP470 scope
> above.

A PEP can’t really mandate anything to an installer, and with PEP 438 I
think we found that mandating from above how things are implemented easily
ends up being something that turns out worse in the long run. Pip has no
means to improve upon the UX of PEP 438 except by deciding we’re not going
to follow the PEP. We’d (I’d?) rather not just throw out what the PEPs say,
so we generally want to follow them.

I have plans (and even a branch started!) to further enhance the multiple
repository support in pip. A lot of that is modeled after what yum and apt-get
have as far as options go. I am completely and unequivocally against things
which mandate much at all about what UX pip presents for these things because
I think we can better serve our users by being able to make our own UX decisions.
After my experiences with a mandated UX from a PEP I’m at the point where
personally I’ll ignore any such mandate in the future where I think there
is a better option for pip.
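
For reference, the lookup rule holger proposes above (consult the private
index first, and fall through to the public one only for whitelisted
names) could be sketched roughly as follows. This is only an illustration
of the quoted proposal: `--private-index-url` is his suggested option, not
an existing pip flag, and the function and data names here are made up.

```python
# Sketch of the proposed "private index first" rule; hypothetical names,
# not an existing pip or devpi API.

def indexes_to_consult(name, private_index, whitelist=frozenset()):
    """Return which indexes may supply `name`.

    private_index: mapping of project name -> list of versions
    whitelist: names allowed to mix private and public links
    """
    if name in private_index:
        if name in whitelist:
            # Explicitly whitelisted: private and public may both be used.
            return ["private", "public"]
        # Found privately: the public index is not considered at all.
        return ["private"]
    # Not hosted privately: only the public index can supply it.
    return ["public"]


private_index = {"acme-internal": ["1.2"], "acme-fork": ["0.3"]}
whitelist = frozenset({"acme-fork"})

internal = indexes_to_consult("acme-internal", private_index, whitelist)
forked = indexes_to_consult("acme-fork", private_index, whitelist)
public_only = indexes_to_consult("requests", private_index, whitelist)
```

The usability cost he mentions shows up in the whitelist: every name that
should mix sources has to be listed by hand.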

Donald Stufft
PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
