[Distutils] PEP470 installation security problems

holger krekel holger at merlinux.eu
Wed Oct 8 12:06:59 CEST 2014

On Wed, Oct 08, 2014 at 05:44 -0400, Donald Stufft wrote:
> > On Oct 8, 2014, at 4:44 AM, holger krekel <holger at merlinux.eu> wrote:
> > 
> > On Wed, Oct 08, 2014 at 03:47 -0400, Donald Stufft wrote:
> >>> On Oct 8, 2014, at 3:17 AM, holger krekel <holger at merlinux.eu> wrote:
> >>> Worse security problems loom with current multi-index ops like
> >>> the --extra-index-url option which is advertised prominently in PEP470.
> >>> You recommend to use it for private package indexes, but it can
> >>> trivially compromise user machines: you register a private package name
> >>> publicly to pypi and add some malware release files, and can then
> >>> infect all machines which execute an innocent "pip install
> >>> --extra-index-url ...".  I think we conversed about this issue earlier but i
> >>> don't see the PEP discussing it but rather it recommends using it
> >>> without a direct call for caution (*).  I maintain this attack is more
> >>> serious than MITM attacks for which you are even ready to break backward
> >>> compat.
> >> 
> >> In the context of PEP 470 it’s giving another way for someone who has
> >> registered a project on PyPI to host off of PyPI. In this sense there is
> >> zero ability for someone else to come along and “override” the package name.
> >> The ability to do this for private projects is really only relevant in that
> >> by reusing that mechanism we have a single concept that users need to learn
> >> instead of multiple concepts. “There should be one way to do it”.
> >> 
> >>> 
> >>> Donald, Nick, i am not against the goals of PEP470 per se but in its
> >>> current form i see it rather causing damage.  When i explained to companies
> >>> the dangers of pip multi-index operations they were rather alarmed and urged
> >>> me to do something about it within the devpi context.  But PEP470 pretends
> >>> all is fine and everybody should move to multi-index immediately -- that's
> >>> premature at least if not outright endangering users even today because
> >>> they take the advice in the draft PEP470 for granted because it comes
> >>> from Nick and Donald who usually know what they are talking about.
> >> 
> >> This is really FUDish. Multi repository support *is* fine. If you have a private
> >> project then you should likely claim the name on PyPI because even without
> >> multi repository support all it would take is someone running pip on their
> >> machine and forgetting to switch to your internal index to attack you too.
> > 
> > I am sorry if raising the issue of private/public compromises sounds
> > like FUD to you.  From my experience it's a real attack vector.  I talked
> > about this at EP2014 (http://youtu.be/aNrrGf-uNUY?t=6m1s ) and people
> > got back to me afterwards, surprised.
> > 
> > And I don't think you can successfully ask people in companies
> > around the world to register private package names publicly (let alone
> > the issue of clashes etc.).  Admit it, that's even more unlikely than
> > people using some PEP438 features :)
> > 
> > And yes, if someone forgets to set the private index he could still pull
> > in malicious public links even with devpi or new pip options.
> I think raising the issue is FUDish because it has nothing to do with using
> multi repository support for things that are registered on PyPI. 

Well, the PEP has two central paragraphs motivating multi-index operations:

    The two common installer tools, pip and easy_install/setuptools, both
    support the concept of additional locations to search for files to
    satisfy the installation requirements and have done so for many years.
    This means that there is no need to "phase" in a new flag or concept and
    the solution to installing a project from a repository other than PyPI
    will function regardless of how old (within reason) the end user's
    installer is. Not only has this concept existed in the Python tooling
    for some time, but it is a concept that exists across languages and even
    extending to the OS level with OS package tools almost universally using
    multiple repository support making it extremely likely that someone is
    already familiar with the concept.

    Additionally, the multiple repository approach is a concept that is
    useful outside of the narrow scope of allowing projects which wish
    to be included on the index portion of PyPI but do not wish to
    utilize the repository portion of PyPI. This includes places where a
    company may wish to host a repository that contains their internal
    packages or where a project may wish to have multiple "channels" of
    releases, such as alpha, beta, release candidate, and final release.

and then it concretely suggests "--extra-index-url" and gives an example.
It does not say that this is only good if you are using private projects
that have a presence on PyPI.  It rather suggests multi-index is the thing 
to go for today, generally, does it not?

Given that PyPI is a wiki and Linux distros are curated indexes, i
insist it's dangerous to recommend mixing multiple indexes with pip
unless you know exactly what you are doing.  Do you really disagree
on this?

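To make the concern concrete, here is a minimal sketch of the problem,
assuming an installer that simply merges the candidates from all
configured indexes and picks the highest version.  It is a simplified
model, not pip's actual code, and the names (Candidate, pick_merged,
"acme-billing") are hypothetical:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    version: tuple  # e.g. (1, 2, 0)
    index: str      # which index served this release file


def pick_merged(candidates):
    """Pick the highest version across all indexes, ignoring origin."""
    return max(candidates, key=lambda c: c.version)


# A company hosts "acme-billing" only on its private index.
private = [Candidate("acme-billing", (1, 2, 0), "private")]
# Someone registers the same name on the public index with a huge version.
public = [Candidate("acme-billing", (99, 0, 0), "pypi")]

chosen = pick_merged(private + public)
print(chosen.index, chosen.version)
```

Under this model the public release wins purely on version number,
although the name was only ever meant to live on the private index.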

> The attack
> vector you’re describing isn’t possible at all for any project that is affected
> by PEP 470, which are projects which wish to register themselves in the PyPI
> index without using PyPI as their repository.
> The *only* reason using the multiple repository support for private hosting is
> relevant is because it’s the option for allowing people to have private
> repositories at all, and we can re-use that behavior with this. It’s not particularly
> relevant to PEP 470 unless you have a suggestion for another, better mechanism
> that can satisfy all of these use cases (and possibly more?).
> That being said, the things I have sketched out for pip includes the ability to
> have both a whitelist and a black list for each repository. I don’t however think
> that it’s the PEPs place to dictate how that looks (or even if it exists).
> I’m also not against adding another *SHOULD* saying that installers should
> implement some mechanism that allows for whitelisting or blacklisting which
> repository particular projects come from.
> > 
> >> Can there be more improvements? Absolutely. However this particular problem
> >> is an inherent issue with a central repository that anyone can upload to.
> >> There are things we can do to make it less of a problem but it’s impossible
> >> to ever completely solve it.
> > 
> > Linux repos are totally different: their main index is a curated index
> > and pypi's is a wiki.  Thus merging links from a private index and 
> > the pypi wiki can trivially wreak havoc while putting malware into
> > the central Debian or Redhat repo is very hard.  
> > 
> >>> At the very least we need to have clear discussion in the PEP about it
> >>> and safer options for pip and PEP470 needs to MANDATE it for pip and
> >>> maybe even for easy_install -- you could follow the devpi
> >>> "pypi_whitelist" design to prevent mixed private/public package links
> >>> and introduce a "--private-index-url" which means that pip would look
> >>> first there and when it finds links for a name it would not consider
> >>> other/public indexes unless the name is explicitly whitelisted.  I
> >>> admit i am not happy about the usability of that but it gives a good
> >>> secure default against public packages infecting private package installs.
> >>> 
> >>> best,
> >>> holger
> >>> 
> >>> (*) I saw that PEP470 in a different section says "Installers SHOULD
> >>> implement some mechanism for removing or otherwise disabling use of the
> >>> default repository." but that's just a "SHOULD" and even if implemented 
> >>> it will not fix things retro-actively for older pip/easy_install
> >>> users -- but you claim fixing things for them is within the PEP470 scope
> >>> above.
> >>> 
> >> 
> >> A PEP can’t really mandate anything to an installer and with PEP 438 I
> >> think we found that mandating how things are implemented from on top easily
> >> ends up being something that turns out worse in the long run.
> > 
> > UI design is a delicate thing -- but i am sure you remember that
> > you were involved in PEP438 and actually pushed for some UI that you are
> > now criticising.  I am a bit irritated but i understand that you probably
> > all along wanted to push the processes towards the "multi-repo" idea.
> > Please note that i am not against this in principle.
> Absolutely. I don’t think that it was clear at the time that the PEP 438 UX
> would be as bad as it turned out to be. My take away from that isn’t so much
> that the people involved were bad at UXs but that codifying a UX into a PEP
> is a bad idea in general. With PEP 470 you can see this because even on the
> PyPI side I don’t dictate a UX in it. I spell out the API that I expect
> PyPI to adopt but I don’t mention what the UX looks like so that we can
> easily adjust it. I also don’t spell out what the UXs on the installers look
> like, again because it’s my belief now that dictating UX is a generally bad
> idea, instead I spell out what features an installer *should* have.
> This is all just lessons learned from trying to spell out a UX inside of a PEP,
> we did it, it didn’t work and when it didn’t work the fact it was in a PEP put
> us in a crappy situation of having to either write a whole new PEP (and possibly
> recreate new UX issues) or start ignoring PEPs.
> For the record, both pip and easy_install already have mechanisms for disabling
> the default repository. In pip this is ``--no-index`` and in easy_install it’s
> not as easy but you can either override it with ``--index-url`` or use the
> ``--allow-hosts`` option to disallow PyPI.
> > 
> >> Pip has no means to improve upon the UX of PEP 438 except by deciding
> >> we’re not going to follow the PEP. We’d (I’d?) rather not just throw
> >> out what the PEPs say so we generally want to follow things.
> > 
> > And that's a good thing, thanks!  Given the importance of PyPI today in
> > the python community, I think the way how PyPI interacts with tools and
> > installers deserves PEPs.
> Yea I agree, which is why I’m trying to figure out how to do PEPs without
> making them feel more like handcuffs than useful tools :)
> > 
> >> I have plans (and even a branch!) started to further enhance the multiple
> >> repository support in pip. A lot of that is modeled after what yum and apt-get
> >> have as far as options go. I am completely and unequivocally against things
> >> which mandate much at all to what UX pip presents for these things because
> >> I think we can better serve our users by being able to make our own UX decisions.
> >> After my experiences with a mandated UX from a PEP I’m at the point where
> >> personally I’ll ignore any such mandate in the future where I think there
> >> is a better option for pip.
> > 
> > PEPs are a form of helping collaboration and growth in a community but
> > certainly not the only way and, if done badly, can do more damage than good.
> > 
> > best,
> > holger
> ---
> Donald Stufft
> PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
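
For illustration, the "--private-index-url" idea quoted above (look at
the private index first and ignore public links for a name found there,
unless that name is explicitly whitelisted) could be sketched roughly
like this.  This is only a sketch of the proposed semantics under
simplifying assumptions, not devpi's or pip's implementation; the
function name, the dict-based index model and the package names are all
hypothetical:

```python
def resolve(name, private_index, public_index, whitelist=frozenset()):
    """Return the best (version, origin) pair for a project name.

    Each index is modelled as a dict: name -> list of (version, origin).
    """
    private_hits = private_index.get(name, [])
    public_hits = public_index.get(name, [])
    if private_hits and name not in whitelist:
        # The name exists privately: public links are not considered.
        candidates = private_hits
    else:
        # Unknown privately, or explicitly whitelisted: merge both.
        candidates = private_hits + public_hits
    if not candidates:
        return None
    return max(candidates, key=lambda c: c[0])


private = {"acme-billing": [((1, 2, 0), "private")]}
public = {
    "acme-billing": [((99, 0, 0), "pypi")],
    "requests": [((2, 4, 3), "pypi")],
}

print(resolve("acme-billing", private, public))
print(resolve("requests", private, public))
print(resolve("acme-billing", private, public, whitelist={"acme-billing"}))
```

The default keeps a public release from shadowing a private name; the
whitelist is the explicit opt-in for names that should mix both sources.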
