[Distutils] PEP draft on PyPI/pip package signing

Tue Jul 29 07:01:53 CEST 2014

On July 28, 2014 at 11:43:08 PM, Nick Coghlan (ncoghlan at gmail.com) wrote:
> 
> 1. SSL-independent authentication of the link from PyPI to the end
> user. PEP 458 largely covers this if you drop the "claimed" role. (We
> still have the "root key" management problem, but at least that's
> relatively narrow in scope)
> 2. Preventing a compromise of PyPI. This is just normal web app
> perimeter defence, and generally account security management. 2FA
> proposals fit into this. They're largely a matter for the PyPI
> development team and the PSF infrastructure team, rather than needing
> broad collaboration through the PEP process.
> 3. *Recovering* from a compromise of PyPI. Assuming that an attack has
> happened, and PyPI *has* been compromised, how do we identify which
> packages were compromised and remove them, reverting PyPI to a known
> good state. The "whole system" snapshot approach of TUF can
> potentially help here, since it makes it substantially more difficult
> to go back and surreptitiously modify past snapshots.
> 
> To be honest, I've become more convinced over time that this is the
> right approach, particularly as we're contemplating the idea of adding
> a build farm directly to PyPI at some point in the future. At that
> point, PyPI emphatically *is* the publisher, and the link to be
> secured is the one from PyPI to the end user, just as the link being
> secured in the Linux distro case is the one from the distro build
> system to the end user, not from the package maintainer (let alone the
> upstream project).

I've not completely given up on the idea of E2E validation, but I do worry a
lot about whether it's actually useful or just feel-good snake oil. As Nick
mentioned, the biggest problem is whether people are going to use it at all,
and if they do, whether they will secure their keys well enough that it
actually provides significant value.

There is some benefit even if people do not properly secure their keys, simply
because a complete compromise of PyPI would then also require compromising all
of the authors' keys. However, it's likely that a significant number of people
will never use a key unless we force them to. In that case you can probably
daisy chain a compromise to everyone by compromising the packages you can, and
using those compromised packages to compromise the remaining people who are
installing those packages. Even if you can't, it's likely that even a 75%
compromise is going to be sufficiently bad that the remaining 25% isn't
particularly meaningful (and I think that 25% use is extremely generous).
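
To make the daisy-chain argument concrete, here is a minimal sketch in Python.
Everything in it is an illustrative assumption rather than data from PyPI: the
package names are made up, `installs` is a hypothetical mapping from a package
to the packages its maintainer installs, and the model simply assumes that a
maintainer who installs a tampered package loses control of their signing key.

```python
def daisy_chain(installs, signed):
    """Return the set of packages an attacker who controls PyPI can tamper with.

    installs: dict mapping each package to the set of packages its
              maintainer installs (hypothetical data).
    signed:   set of packages whose authors sign their releases.
    """
    # Unsigned packages can be replaced directly once PyPI is compromised.
    compromised = {pkg for pkg in installs if pkg not in signed}

    # Assumption: installing a tampered package leaks the maintainer's key,
    # so their own signed package becomes compromised too. Repeat until
    # no new package falls.
    changed = True
    while changed:
        changed = False
        for pkg, deps in installs.items():
            if pkg not in compromised and deps & compromised:
                compromised.add(pkg)
                changed = True
    return compromised


# Hypothetical ecosystem: only "beta" is unsigned.
installs = {
    "alpha": {"beta"},   # signed, but its maintainer installs "beta"
    "beta": set(),       # unsigned
    "gamma": {"alpha"},  # signed, maintainer installs "alpha"
    "delta": set(),      # signed, maintainer installs nothing from PyPI
}
signed = {"alpha", "gamma", "delta"}

result = daisy_chain(installs, signed)
print(sorted(result))
```

In this toy example a single unsigned package is enough to reach two of the
three signed ones; only the maintainer who installs nothing else stays safe.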

Nick also pointed out the problem we have once we have a build farm on PyPI.
One of the problems with Wheels is that building them requires access to a
large number of build machines. You'll need 1-2 Windows boxes, an OS X box,
and potentially a variety of Linux boxes. This is a lot of infra for people
to maintain, especially for small one-person FOSS projects. A PyPI-run build
farm is a great way to ease this burden. However, it mandates that a
PyPI-owned machine has a key that is able to publish for those projects. In an
E2E scheme this is a super valuable target because it represents a central
machine that is connected to the Internet, has a decrypted key in memory, and
is able to sign for a lot of packages.

When you look at all of this, I think you have to question whether E2E is
actually even possible for us in a meaningful way. If it's not, then why
should we pay the cost, and why should we pass that cost on to our end users?

Now, if you take E2E off the table (and I haven't done that, but I think about
it often) you can start looking at radically simpler proposals. Right now we
rely on the security of TLS. TLS obviously has a lot of problems that it would
be nicer to solve. If we take E2E off the table then we could implement signing
using only online keys on PyPI itself (possibly with an offline key as the
trust root just to make revocations and the like simpler). This would eliminate
the need to trust anything but the machine that has that particular signing
key. That machine could then have a higher level of protection placed around
it in order to protect the key. This isn't a fully fleshed out thing by any
means, but it's a possibility for a much simpler proposal that also covers
trust of mirrors and keeps working through proxies, corporate networks with
MITM connections, etc.

More or less, I agree that looking at the 3 points Nick mentioned is a really
good idea, and in my mind it will be far more constructive than more E2E
proposals. If we're going to do E2E, I think TUF is the baseline that a
proposal needs to beat in either security or usability for the end user.
Unfortunately, this proposal has worse security and requires the same sort of
things from the end user; what it primarily does is make implementation
easier.

-- 
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA