[Catalog-sig] [DRAFT] Proposal for fixing PyPI/pip security

Giovanni Bajo rasky at develer.com
Mon Feb 11 20:26:31 CET 2013


On 10 Feb 2013, at 23:20, Jesse Noller <jnoller at gmail.com> wrote:

> 
> 
> On Feb 10, 2013, at 7:54 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> 
>> On Sun, Feb 10, 2013 at 10:36 PM, Jannis Leidel <jannis at leidel.info> wrote:
>>> 
>>> On 10.02.2013, at 05:44, Nick Coghlan <ncoghlan at gmail.com> wrote:
>>> 
>>>> On Sun, Feb 10, 2013 at 7:23 AM, Giovanni Bajo <rasky at develer.com> wrote:
>>>>> Hello,
>>>>> 
>>>>> my proposal for fixing PyPI and pip security is here:
>>>>> https://docs.google.com/a/develer.com/document/d/1DgQdDCZY5LiTY5mvfxVVE4MTWiaqIGccK3QCUI8np4k/edit#
>>>>> 
>>>>> I tried to sum up the discussions we had here last week, elaborating on Heimes' proposal by simplifying it where I thought the additional steps wouldn't guarantee additional security. At this point, the proposal does not include a central, uber-master online GPG signing key to be stored on PyPI, which is IMO quite hard to handle correctly.
>>>> 
>>>> I think the parts related to improving the HTTPS/SSL based security
>>>> are solid, but for the other aspects of secure updates, integrating
>>>> TUF (https://www.updateframework.com/) into the PyPI based
>>>> distribution infrastructure sounds like the best available option for
>>>> enhancing the end-to-end integrity checking. TUF has a comparatively
>>>> well-developed threat model, and systematically covers many of the
>>>> attack vectors discussed in the past few days (including provision of
>>>> old, known vulnerable, versions).
>>> 
>>> Would you mind explaining why TUF is good?
>> 
>> The main benefit in my mind is that it isn't a from-scratch design of
>> a secure update infrastructure. Instead, it's a project that was
>> started in order to resolve some security holes found in Tor's already
>> robust automatic update mechanism, then proceeded from there into
>> updates to yum, yast, apt, etc (i.e. the distro update mechanisms that
>> are vetted by the security teams of the various Linux vendors). The
>> fact Geremy Condra is involved in TUF also counts for a lot with me
>> (as I suspect it would for many people that have heard Geremy talk
>> about security issues in Python).
>> 
>> However, the design itself also seems sensible, and is able to provide
>> its security guarantees even if you're *not* using SSL certs to
>> protect the in-flight traffic (thus meaning that the SSL
>> infrastructure in the near term will become a matter of providing
>> defence-in-depth, rather than being a required part of the security
>> scheme).
>> 
>> I trust our collective ability to make TUF sufficiently easy to use as
>> part of Python's packaging infrastructure a *lot* more than I trust
>> our collective ability to come up with a from-scratch distribution
>> scheme that is both usable *and* secure.
>> 
>>> The site doesn't seem to work for me right now.
>> 
>> D'oh, looks like their domain wasn't set to auto-renew :(
>> 
>> Cheers,
>> Nick.
>> 
> 
> Feedback from Geremy below:
> 
> OK, so, I think there's a lot of stuff conflated here. It'll probably help to simplify things if we decouple them.
> 
> First, the point about serving metadata over a secure channel and data over a cheap one is right on. Given the size of your metadata versus actual data, maintaining a central metadata service but not caring about where/how data is hosted is the right way to go. Note that that channel doesn't have to be SSL: a verifying cert on the device would still give you everything you need.
> 
> Second, decouple the transport-level problem from verifying code. SSL is good, but it doesn't provide end-to-end security, which is what you care about here. A good alternative is the Android model, with per-developer keys: it keeps attribution with code, and clients can verify that the key is correct based on the current and possibly previous signed metadata bundle.
> 
The Android model is the self-signed model (aka the SSH model), where a user is presented with a self-signed certificate and just needs to press Yes without verifying it. In fact, Android doesn't even ask the user; it assumes that everything you download from the app store is "correct". So in a way, the main protection it offers is that another user cannot publish an application with the same name on the Android Market. But AFAIK if I download an application, insert malware, sign it myself, and send it directly to a victim, the victim has no way to realize the application carries the wrong signature (unless it has been installed before).
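
To make the first-use weakness concrete, here is a minimal sketch of a trust-on-first-use check in Python. The `TofuStore` class, its method name, and its return strings are made up for illustration; this is not how Android or pip actually implement anything, only the general shape of the model as I understand it.

```python
import hashlib


class TofuStore:
    """Hypothetical store that pins the first signer key seen per app name."""

    def __init__(self):
        # Maps application name -> SHA-256 fingerprint of the pinned signer key.
        self._pinned = {}

    def check(self, name, signer_key):
        """Return a verdict string for an app signed with signer_key (bytes)."""
        fingerprint = hashlib.sha256(signer_key).hexdigest()
        pinned = self._pinned.get(name)
        if pinned is None:
            # Nothing to compare against: the key is accepted blindly and pinned.
            self._pinned[name] = fingerprint
            return "accepted-first-use"
        if pinned == fingerprint:
            return "accepted-known-key"
        # Only an already-installed app gives the device something to compare.
        return "rejected-key-changed"


# A device that already has the app notices a re-signed copy...
device = TofuStore()
device.check("app", b"developer-key")
print(device.check("app", b"attacker-key"))        # rejected-key-changed

# ...but a fresh device accepts whatever key arrives first.
fresh_device = TofuStore()
print(fresh_device.check("app", b"attacker-key"))  # accepted-first-use
```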

The Android model works well for their use cases, but we can do more than that. In our case, users install packages by their unique package name, so we can make sure pip refuses to install a package named "django" that does not carry the expected signature, even if it comes from a different source, because pip can always double-check with PyPI.

Plus, our advanced users might want to have a fixed trust list, never updated from PyPI, which is allowed in my design.
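
As a rough illustration of the name-bound check and the fixed trust list, here is a minimal Python sketch. The function names, the dictionary layout (package name mapped to a set of maintainer key fingerprints), and the fingerprints themselves are all assumptions made for this example; they are not pip's API and not the exact data format from the proposal.

```python
def allowed_fingerprints(name, index_trust, pinned_trust=None):
    """Return the key fingerprints allowed to sign the package `name`.

    index_trust: trust data as fetched from the index (hypothetical layout).
    pinned_trust: optional local, fixed trust list that is never updated
    from the index; when given, it is the only source consulted.
    """
    source = pinned_trust if pinned_trust is not None else index_trust
    return set(source.get(name, ()))


def may_install(name, signer_fingerprint, index_trust, pinned_trust=None):
    """Accept a package only if its signer is trusted for that exact name."""
    return signer_fingerprint in allowed_fingerprints(
        name, index_trust, pinned_trust
    )


# Trust data as the index would report it (made-up fingerprints).
index_trust = {"django": {"AAAA1111"}}

print(may_install("django", "AAAA1111", index_trust))  # True
print(may_install("django", "EVIL9999", index_trust))  # False

# An advanced user pins their own list; the index is then ignored.
pinned_trust = {"django": {"BBBB2222"}}
print(may_install("django", "AAAA1111", index_trust, pinned_trust))  # False
print(may_install("django", "BBBB2222", index_trust, pinned_trust))  # True
```

The point of the sketch is only that the decision is keyed on the package name, so where the file was downloaded from does not matter.
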
> It's also very understandable for developers, who we found sometimes got confused by TUF's many keys, integrates well with managed HSM solutions like VeriSign's, and can integrate with SCMs for really nice commit-level authentication.
> 
To clarify, the above statements also apply to the design I propose (which is actually the one that is already present, with --sign). They don't apply to TUF.

It's interesting that Geremy says that the TUF model is confusing -- I have never used it, but it did in fact look a little too complicated for our average package maintainer. If we were worried about making them create a single GPG key with a single command line, here we're talking about handling 4 different keys (maybe 3 if the timestamping key is handled by PyPI).
> The major attack left in this model is a compromise of the metadata server key, which is the problem TUF was designed to address. In practice, you are going to have a bad time if this happens, since TUF converts this compromise into a denial of service some period of time later. I'll have to think a bit about whether this is good enough for you, but I'm inclined to say no based on the difficulty of doing audits on so much output. Might just be best to simplify the system and keep this aspect hardened and trusted. Happy to talk more about that if you're interested.
> 
Sure, can he join the discussion? I'm also available for a chat, or a call.
-- 
Giovanni Bajo   ::  rasky at develer.com
Develer S.r.l.  ::  http://www.develer.com

My Blog: http://giovanni.bajo.it
