[Distutils] PEP draft on PyPI/pip package signing

Tue Jul 29 05:42:44 CEST 2014

On 29 July 2014 11:50, Ian Cordasco <graffatcolmingov at gmail.com> wrote:
> On Mon, Jul 28, 2014 at 8:12 PM, Giovanni Bajo <rasky at develer.com> wrote:
>> On 29 Jul 2014, at 02:39, Nick Coghlan <ncoghlan at gmail.com>
>> wrote:
>>> If your PEP defends against all the attacks TUF does, then it will be just
>>> as complicated as TUF. If it doesn't defend against all those attacks, then
>>> it offers even less justification for the complexity increase than TUF.
>>
>> 1) TUF isn’t designed for PyPI and pip. It’s a generic system meant for many
>> different scenarios, which is then adapted (with many different compromises)
>> to our use case. So you can’t really postulate that you absolutely need
>> something as complicated to get to the same level of defense.
>> 2) Security is never perfection. You might very well decide that the
>> increased level of security is not worth the complexity increase.
>>
>> My solution is far far simpler than TUF. To me, it’s a reasonable compromise
>> between complexity of implementation, time/costs required for all involved
>> parties, and decreased security.
>
> While there's a significant difference between the complexity of the
> proposals to implement, there's also a significant difference in
> complexity for the end users.
>
> On the one hand, to implement PEP458, would mean a great deal of work
> on those working on PyPI/Warehouse and pip, but it would have little
> (if any) end user implications or complications.
>
> Your PEP on the other hand, causes some instabilities (especially if
> PGP/GPG isn't installed, or if someone has hijacked the PATH) and will
> create only headaches for the users. They'll have to install GPG,
> generate a Key, upload the key, secure the key, and make sure they
> don't ever lose it. While there's less complexity for the
> implementers, there's much more for end users. We don't want to make
> packaging worse for users in exchange for a negligible (questionable?)
> amount of increased security.

Right. Properly securing signing keys is incredibly painful. Pushing
that burden onto package authors doesn't magically make it less
painful, it just distributes the pain to more people. The challenges
that PEP 458 waves away regarding "How do the PyPI admins manage the
root signing keys?" are exactly the same challenges that package
authors would face under either PEP, particularly when a project is
maintained by a distributed team with multiple people that can make
releases. Managing signing keys in distributed development in such a
way that they're actually worthy of trust is *hard*.

Attaching a signature to something is relatively meaningless if we
don't know how the signing key is itself being managed. If it isn't
handled carefully, then the signing process *itself* can become a
vulnerability, since difficulties signing a new release can prevent
the release of a new version. We actually face this problem with
CPython: we *cannot* quickly hand over the task of creating the binary
installers to new maintainers, as we have to obtain and communicate
the relevant information regarding the new signing keys. One of the
things Red Hat subscribers are paying for is the fact that we *do*
have a certified build system, so our signing key actually means
something. Other Linux distros are also paranoid about protecting
their signing keys, even if they're not in a position to go for full
security certifications.

You can get a feel for the complexity involved in doing signing key
management properly by looking at what's involved in running your own
Certificate Authority: http://pki.fedoraproject.org/wiki/PKI_Main_Page
(or see Ade's talk at Flock last year:
https://www.youtube.com/watch?v=OvAdCxvPjmM)

And that's the conundrum we face with end to end signing proposals;
they shift the responsibility for managing key integrity to folks that
by and large will fall into one of the following camps:

1. They have no idea what "key management" means, or why they should care
2. They have some idea what "key management" means, and, mistakenly,
think it's easy
3. They know exactly what "key management" means, and hence, quite
sensibly, don't want to have anything to do with it on a volunteer
driven project

Folks in categories 1 & 3 won't make use of end-to-end signing on
PyPI. Folks in category 2 might make use of it, but would be at
significant risk of not securing their signing keys properly (folks
that are appropriately paranoid about the difficulties of managing
signing keys safely are the ones that will say "no, I don't want to").

This is why end-to-end signing support isn't really a technology
problem, it's a "people and processes" problem. It's such a hard one
that even commercial software companies struggle to deal with it,
hence the rise of the app store model to protect the integrity of
mobile devices, while still allowing the use of applications from a
wide variety of developers. The Linux distros have *always* worked
that way - the signing keys are controlled as part of the distro build
infrastructure, not by the individual package maintainers. The package
maintainers just upload source packages, and the build system takes
care of the rest (including signing the result).

It is thus very, very tempting to declare the end-to-end signing
problem unsolvable in the general case, given the diverse range of
publishers that PyPI supports. If we *do* take that step, then the
threat model shifts, since we declare that yes, an attacker that fully
compromises PyPI *will* be able to masquerade as any publisher. At
that point, the security response shifts to focusing on:

1. SSL-independent authentication of the link from PyPI to the end
user. PEP 458 largely covers this if you drop the "claimed" role. (We
still have the "root key" management problem, but at least that's
relatively narrow in scope)
2. Preventing a compromise of PyPI. This is just normal web app
perimeter defence, and general account security management. 2FA
proposals fit into this. They're largely a matter for the PyPI
development team and the PSF infrastructure team, rather than needing
broad collaboration through the PEP process.
3. *Recovering* from a compromise of PyPI. Assuming that an attack has
happened, and PyPI *has* been compromised, how do we identify which
packages were compromised and remove them, reverting PyPI to a known
good state? The "whole system" snapshot approach of TUF can
potentially help here, since it makes it substantially more difficult
to go back and surreptitiously modify past snapshots.
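
As a rough illustration of why chained snapshots make past state hard to
rewrite quietly, here is a minimal sketch. It is not TUF's actual metadata
format or API: the function names and the "filename -> hash" snapshot layout
are assumptions made up for this example. Each snapshot digest covers the
previous digest, so changing an old snapshot changes every digest after it.

```python
import hashlib
import json


def snapshot_digest(package_hashes, previous_digest):
    """Digest of one snapshot, bound to the digest of the one before it.

    package_hashes is a hypothetical mapping of filename -> content hash.
    """
    payload = json.dumps(
        {"packages": package_hashes, "previous": previous_digest},
        sort_keys=True,
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def build_chain(snapshots):
    """Return the chained digests for a sequence of snapshots, oldest first."""
    digests = []
    previous = ""  # the first snapshot has no predecessor
    for package_hashes in snapshots:
        previous = snapshot_digest(package_hashes, previous)
        digests.append(previous)
    return digests


def first_divergence(recorded_digests, snapshots):
    """Index of the first snapshot that no longer matches the recorded
    digests, or None if the whole history still matches."""
    recomputed = build_chain(snapshots)
    for index, (recorded, actual) in enumerate(zip(recorded_digests, recomputed)):
        if recorded != actual:
            return index
    return None


if __name__ == "__main__":
    history = [
        {"foo-1.0.tar.gz": "aa"},
        {"foo-1.0.tar.gz": "aa", "bar-2.0.tar.gz": "bb"},
    ]
    recorded = build_chain(history)  # kept somewhere the attacker can't reach
    history[0]["foo-1.0.tar.gz"] = "tampered"
    print(first_divergence(recorded, history))
```

The sketch only helps if the recorded digests are stored or signed somewhere
an attacker who controls PyPI can't also rewrite, which is the same "root
key" management problem as in point 1.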

To be honest, I've become more convinced over time that this is the
right approach, particularly as we're contemplating the idea of adding
a build farm directly to PyPI at some point in the future. At that
point, PyPI emphatically *is* the publisher, and the link to be
secured is the one from PyPI to the end user, just as the link being
secured in the Linux distro case is the one from the distro build
system to the end user, not from the package maintainer (let alone the
upstream project).

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia