On Oct 8, 2014, at 4:44 AM, holger krekel wrote:
On Wed, Oct 08, 2014 at 03:47 -0400, Donald Stufft wrote:
On Oct 8, 2014, at 3:17 AM, holger krekel wrote:
Worse security problems loom with current multi-index ops like the --extra-index-url option, which is advertised prominently in PEP470. You recommend using it for private package indexes, but it can trivially compromise user machines: you register a private package name publicly on pypi, add some malware release files, and can then infect all machines which execute an innocent "pip install --extra-index-url ...". I think we conversed about this issue earlier, but i don't see the PEP discussing it; rather, it recommends using the option without a direct call for caution (*). I maintain this attack is more serious than the MITM attacks for which you are even ready to break backward compat.
In the context of PEP 470 it’s giving another way for someone who has registered a project on PyPI to host off of PyPI. In this sense there is zero ability for someone else to come along and “override” the package name. The ability to do this for private projects is really only relevant in that by reusing that mechanism we have a single concept that users need to learn instead of multiple concepts. “There should be one way to do it”.
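The name-collision attack holger describes above can be illustrated with a toy model. This is only a sketch, assuming an installer that merges the release links from every configured index and then prefers the highest version; it is not pip's actual code. The project name `acme-billing`, the index contents, and the helper names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    version: tuple  # simplified version, e.g. (1, 0)
    index: str      # label of the index the release came from


def merged_candidates(name, indexes):
    """Collect release candidates for `name` from every configured index."""
    found = []
    for index_label, releases in indexes.items():
        for version in releases.get(name, []):
            found.append(Candidate(name, version, index_label))
    return found


def pick_highest(candidates):
    """Pick the highest version, ignoring which index it came from."""
    return max(candidates, key=lambda c: c.version)


# Hypothetical index contents: the company hosts acme-billing privately,
# and an attacker registers the same name on the public index with a
# higher version number.
PRIVATE = {"acme-billing": [(1, 0), (1, 1)]}
PUBLIC = {"acme-billing": [(99, 0)], "requests": [(2, 4)]}

chosen = pick_highest(
    merged_candidates("acme-billing", {"private": PRIVATE, "public": PUBLIC})
)
print(chosen.index, chosen.version)
```

Under these assumptions the attacker's public release wins, because nothing in the selection step records that the name was meant to be private.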
Donald, Nick, i am not against the goals of PEP470 per se, but in its current form i see it rather causing damage. When i explained the dangers of pip multi-index operations to companies, they were rather alarmed and urged me to do something about it within the devpi context. But PEP470 pretends all is fine and everybody should move to multi-index immediately -- that's premature at least, if not outright endangering users even today, because they take the advice in the draft PEP470 for granted since it comes from Nick and Donald, who usually know what they are talking about.
This is really FUDish. Multi repository support *is* fine. If you have a private project then you should likely claim the name on PyPI because even without multi repository support all it would take is someone running pip on their machine and forgetting to switch to your internal index to attack you too.
I am sorry if raising the issue of private/public compromises sounds like FUD to you. From my experience it's a real attack vector. I talked about this at EP2014 (http://youtu.be/aNrrGf-uNUY?t=6m1s ) and people got back to me afterwards, surprised.
And I don't think you can successfully ask people in companies around the world to register private package names publicly (let alone the issue of clashes etc.). Admit it, that's even more unlikely than people using some PEP438 features :)
And yes, if someone forgets to set the private index he could still pull in malicious public links even with devpi or new pip options.
I think raising the issue is FUDish because it has nothing to do with using multi repository support for things that are registered on PyPI. The attack vector you’re describing isn’t possible at all for any project that is affected by PEP 470, which are projects which wish to register themselves in the PyPI index without using PyPI as their repository. The *only* reason using the multiple repository support for private hosting is relevant is because it’s the option for allowing people to have private repositories at all, and we can re-use that behavior with this. It’s not particularly relevant to PEP 470 unless you have a suggestion for another, better mechanism that can satisfy all of these use cases (and possibly more?).
That being said, the things I have sketched out for pip include the ability to have both a whitelist and a blacklist for each repository. I don’t however think that it’s the PEP’s place to dictate how that looks (or even if it exists). I’m also not against adding another *SHOULD* saying that installers should implement some mechanism that allows for whitelisting or blacklisting which repository particular projects come from.
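A per-repository whitelist and blacklist could look roughly like the following. The thread gives no details of Donald's sketch for pip, so everything here is an assumption made for illustration: the `Repository` class, its fields, the `allowed_repositories` helper, and the example URLs are hypothetical and do not describe any existing pip option.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass
class Repository:
    url: str
    # None means "no whitelist configured": any project may come from here.
    whitelist: Optional[FrozenSet[str]] = None
    # Projects that must never be installed from this repository.
    blacklist: FrozenSet[str] = frozenset()

    def may_serve(self, project):
        """Return True if this repository is allowed to supply `project`."""
        if project in self.blacklist:
            return False
        return self.whitelist is None or project in self.whitelist


def allowed_repositories(project, repositories):
    """List the URLs an installer would be allowed to consult for `project`."""
    return [repo.url for repo in repositories if repo.may_serve(project)]


# Hypothetical configuration: the internal repository only serves
# acme-billing, and the public one is forbidden from serving it.
repos = [
    Repository("https://internal.example/simple",
               whitelist=frozenset({"acme-billing"})),
    Repository("https://public.example/simple",
               blacklist=frozenset({"acme-billing"})),
]

print(allowed_repositories("acme-billing", repos))
print(allowed_repositories("requests", repos))
```

With this configuration the private name can only resolve against the internal repository, while every other project still comes from the public one.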
Can there be more improvements? Absolutely. However this particular problem is an inherent issue with a central repository that anyone can upload to. There are things we can do to make it less of a problem but it’s impossible to ever completely solve it.
Linux repos are totally different: their main index is a curated index and pypi's is a wiki. Thus merging links from a private index and the pypi wiki can trivially wreak havoc while putting malware into the central Debian or Redhat repo is very hard.
At the very least we need a clear discussion of it in the PEP and safer options for pip, and PEP470 needs to MANDATE them for pip and maybe even for easy_install -- you could follow the devpi "pypi_whitelist" design to prevent mixed private/public package links and introduce a "--private-index-url", which means that pip would look there first, and when it finds links for a name it would not consider other/public indexes unless the name is explicitly whitelisted. I admit i am not happy about the usability of that, but it gives a good secure default against public packages infecting private package installs.
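The private-first lookup proposed above can be sketched as follows. This is a minimal model of the rule as stated in the mail, not devpi's or pip's implementation; "--private-index-url" is holger's proposed option, and the `select_sources` function, the index dictionaries, and the project names are hypothetical.

```python
def select_sources(name, private_index, public_index, whitelist=frozenset()):
    """Return the (label, versions) sources an installer may use for `name`.

    The private index is consulted first. If it has links for `name`,
    the public index is ignored unless `name` is explicitly whitelisted.
    """
    sources = []
    private_versions = private_index.get(name, [])
    if private_versions:
        sources.append(("private", private_versions))
        if name not in whitelist:
            # Secure default: a privately hosted name never mixes
            # with public links.
            return sources
    public_versions = public_index.get(name, [])
    if public_versions:
        sources.append(("public", public_versions))
    return sources


# Hypothetical index contents.
PRIVATE_INDEX = {"acme-billing": ["1.0", "1.1"], "acme-fork": ["2.0"]}
PUBLIC_INDEX = {"acme-billing": ["99.0"], "acme-fork": ["2.1"], "requests": ["2.4"]}

print(select_sources("acme-billing", PRIVATE_INDEX, PUBLIC_INDEX))
print(select_sources("acme-fork", PRIVATE_INDEX, PUBLIC_INDEX,
                     whitelist=frozenset({"acme-fork"})))
print(select_sources("requests", PRIVATE_INDEX, PUBLIC_INDEX))
```

The usability cost holger mentions shows up in the second call: any name that legitimately needs both private and public links has to be whitelisted by hand.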
best, holger
(*) I saw that PEP470 in a different section says "Installers SHOULD implement some mechanism for removing or otherwise disabling use of the default repository." but that's just a "SHOULD", and even if implemented it will not fix things retroactively for older pip/easy_install users -- but you claim above that fixing things for them is within the PEP470 scope.
A PEP can’t really mandate anything to an installer and with PEP 438 I think we found that mandating how things are implemented from on top easily ends up being something that turns out worse in the long run.
UI design is a delicate thing -- but i am sure you remember that you were involved in PEP438 and actually pushed for some UI that you are now criticising. I am a bit irritated but i understand that you probably all along wanted to push the processes towards the "multi-repo" idea. Please note that i am not against this in principle.
Absolutely. I don’t think that it was clear at the time that the PEP 438 UX would be as bad as it turned out to be. My takeaway from that isn’t so much that the people involved were bad at UXs but that codifying a UX into a PEP is a bad idea in general. With PEP 470 you can see this because even on the PyPI side I don’t dictate a UX in it. I spell out the API that I expect PyPI to adopt but I don’t mention what the UX looks like so that we can easily adjust it. I also don’t spell out what the UXs on the installers look like, again because it’s my belief now that dictating UX is a generally bad idea; instead I spell out what features an installer *should* have. This is all just lessons learned from trying to spell out a UX inside of a PEP: we did it, it didn’t work, and when it didn’t work the fact it was in a PEP put us in a crappy situation of having to either write a whole new PEP (and possibly recreate new UX issues) or start ignoring PEPs.
For the record, both pip and easy_install already have mechanisms for disabling the default repository. In pip this is ``--no-index`` and in easy_install it’s not as easy but you can either override it with ``--index-url`` or use the ``--allow-hosts`` option to disallow PyPI.
Pip has no means to improve upon the UX of PEP 438 except by deciding we’re not going to follow the PEP. We’d (I’d?) rather not just throw out what the PEPs say so we generally want to follow things.
And that's a good thing, thanks! Given the importance of PyPI today in the python community, I think the way PyPI interacts with tools and installers deserves PEPs.
Yea I agree, which is why I’m trying to figure out how to do PEPs without making them feel more like handcuffs than useful tools :)
I have plans (and even a branch!) started to further enhance the multiple repository support in pip. A lot of that is modeled after what yum and apt-get have as far as options go. I am completely and unequivocally against things which mandate much at all about what UX pip presents for these things because I think we can better serve our users by being able to make our own UX decisions. After my experiences with a mandated UX from a PEP I’m at the point where personally I’ll ignore any such mandate in the future where I think there is a better option for pip.
PEPs are a form of helping collaboration and growth in a community but certainly not the only way and, if done badly, can do more damage than good.
best, holger
--- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA