[Distutils] Surviving a Compromise of PyPI - PEP 458 and 480

Donald Stufft donald at stufft.io
Fri Jan 2 14:48:43 CET 2015

> On Jan 2, 2015, at 7:45 AM, Paul Moore <p.f.moore at gmail.com> wrote:
> On 2 January 2015 at 11:21, Donald Stufft <donald at stufft.io> wrote:
>> To be clear, there is zero delay in being able to publish a new project, the
>> delay is between moving from a new project being validated by an online key
>> to an offline key.
> OK, got it.
> Although on the terminology front, I don't really understand what an
> "online key" and an "offline key" are. My mind translates them as
> "worse" and "better" from context, but that's about all. I'm not
> asking for an explanation (I can look that up) but as terms being
> encountered by the non-specialist, they contribute to the difficulty
> in reading the proposals. If there's any better way of naming these
> two types of keys, that would be accessible to the non-specialist,
> that would be a great help.
> (Actually, the other implication I read into "offline key" is
> "something that I, as the owner, have to take care of and manage" -
> and that scares me because I'm rubbish at managing keys or anything
> like that, basically anything more complex than a password - at least
> until we get to things like RSA keys and tokens that I use for "work"
> and am paid to manage as opposed to "hobby" stuff).

Hmm, I’m not sure if there’s really a better way of naming them. Those are
"standard" names for them and they are defined in the PEP under the Definitions
section [1].

For PEP 458 there is zero key management done by anyone other than the people
running PyPI. It essentially replaces TLS-based verification with TUF-based
verification (which gets us a few wins, but still leaves on the table a few
enhancements that we get with the addition of PEP 480).

For PEP 480 authors have to manage *something*. What exactly that *something*
is, is a question for PEP 480. One (poor, in my opinion) possibility is an RSA
key, which means that authors will need to keep that file around, backed up,
and so on. Another possibility is a secondary password which is only used
when signing something (so during uploads and the like).

>> The only real difference between validation levels is that
>> until it's been signed by an offline key then people installing that project
>> are vulnerable to someone compromising PyPI. This is because until the
>> delegation of project X to a specific developer has been signed the "chain of
>> trust" contains a key that is sitting on the hard drive of PyPI.
>> However, once a delegation of Project X _has_ been signed, changing that
>> delegation would have to wait until the next time the delegations were signed
>> by the offline keys. This is because once a project is signed by an offline
>> key then all further changes to the delegation require offline signing.
> "Delegation" is another term that doesn't make immediate sense. I read
> it as "Confirm ownership", sort of. Again, it's not that I can't work
> out what the term means, but I don't get an immediate sense of the
> implications. Here, for example, it's not immediately clear whether
> delegation changes would be common or rare, or whether having them
> happen quickly would be important. (For example, if you're not
> available for a pip release, and we never bothered sharing the keys
> because it's easier just for you to have them, would we need a
> delegation change to do an emergency release?)
> Again, this isn't something that it's important to clarify for me here
> and now, but I would like to see the PEP updated to clarify this sort
> of issue in terms that are accessible to the layman.

This isn’t defined in the PEP.

In PEP 480 though, if I’m the only person who has been set up to be allowed
to sign for pip releases, then nobody else can release pip without intervention
from the PyPI administrators. That’s not entirely different from the current
situation, where if I were the only person added as a maintainer of the pip
project on PyPI and someone else needed to do a release, they couldn’t without
intervention from the PyPI administrators. In other words, doing the required
steps to enable other people’s keys to sign would be part of adding them as
maintainers to that project.
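
As a rough illustration only (nothing here comes from either PEP), the
"adding a maintainer also enables their key" idea could be modelled like
this. The class, method names and key ids are all hypothetical, there is no
real cryptography involved, and it assumes (unverified, as noted elsewhere in
this thread) that an already-authorized key may extend the delegation without
help from PyPI.

```python
class ProjectDelegation:
    """Toy model of who may sign releases for one project (no real crypto)."""

    def __init__(self, project, owner_key_id):
        self.project = project
        # Hypothetical: key ids that are currently allowed to sign releases.
        self.authorized_key_ids = {owner_key_id}

    def can_sign_release(self, key_id):
        """Return True if this key id is allowed to sign a release."""
        return key_id in self.authorized_key_ids

    def add_maintainer(self, new_key_id, approved_by):
        """Authorize a new key; the approver must already be authorized.

        Assumption: anyone else would need the PyPI administrators to step in,
        which is modelled here simply as an error.
        """
        if approved_by not in self.authorized_key_ids:
            raise PermissionError(
                f"{approved_by!r} cannot delegate for {self.project!r}; "
                "PyPI administrator intervention would be required"
            )
        self.authorized_key_ids.add(new_key_id)


pip = ProjectDelegation("pip", owner_key_id="donald-key")
print(pip.can_sign_release("paul-key"))  # not a maintainer yet

# Adding Paul as a maintainer is the step that enables his key.
pip.add_maintainer("paul-key", approved_by="donald-key")
print(pip.can_sign_release("paul-key"))
```

In this model an emergency release by someone else only works if their key
was added ahead of time, which is the same constraint as today's maintainer
list.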

>> In addition, this does not mean (I believe! we should verify this) that the
>> owner of a project cannot further delegate to other people without delay, since
>> they'll be able to sign that delegation with their own keys and won't require
>> intervention from PyPI.
> See above - implies to me that if the "owner" (in terms of keys rather
> than project structure) is unavailable, other project members may not
> be able to upload (rather than as now, when they can upload with the
> project's PyPI account password and/or standard password recovery
> processes to the project email address).

If the owner, and everyone they have delegated to, is unavailable, then yes.

>> So really it looks like this (abbreviated, not exactly, etc):
>> root (offline)
>> |- PyPI Admins (offline)
>>   |- "Unclaimed" (online)
>>      |- New Project that hasn't yet been signed for by PyPI Admins
>>         (offline, owned by author)
>>   |- Existing Project that has already been signed for by PyPI Admins
>>      (offline, owned by author)
> I'm not keen on the term "unclaimed". All projects on PyPI right now
> are "unclaimed" by that usage, which is emphatically not what
> "unclaimed" means intuitively to me. Surely "pip" is "claimed" by the
> PyPA? Maybe "unverified" works as a term (as in, verifying your
> account when signing up to a forum). I get the idea that unclaimed
> implies there's a risk, and sure there is, but this smacks of using a
> loaded term to rub people's noses in the fact that what they've been
> happily using for years "isn't safe". This happens a lot with security
> debates, and IMO actively discourages people from buying into the
> changes.

Unclaimed is probably a bad name for it, although this name wouldn’t
actually be exposed to end users. It’s sort of like if we had rel="unclaimed"
links on /simple/.
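
For readers who find the quoted hierarchy easier to follow as code, here is a
minimal sketch of it. It is a toy model under stated assumptions, not how
TUF or PyPI implement anything: the `Role` class, the helper functions and
the project names are hypothetical. It shows only the point made above, that
a project is exposed to a PyPI compromise for as long as its chain of trust
contains an online key.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Role:
    name: str
    online: bool  # True if the signing key is stored on the PyPI server
    parent: Optional["Role"] = None


def chain_of_trust(role: Role) -> List[Role]:
    """Return the roles from the root down to `role`."""
    chain = []
    node: Optional[Role] = role
    while node is not None:
        chain.append(node)
        node = node.parent
    return list(reversed(chain))


def exposed_to_pypi_compromise(role: Role) -> bool:
    """True if any key in the chain could be stolen from the PyPI server."""
    return any(r.online for r in chain_of_trust(role))


def periodic_signing(project: Role, admins: Role) -> None:
    """Model the PyPI admins signing a project's delegation offline."""
    project.parent = admins


# The hierarchy from the quoted diagram (names are illustrative).
root = Role("root", online=False)
admins = Role("PyPI Admins", online=False, parent=root)
unclaimed = Role("Unclaimed", online=True, parent=admins)
new_project = Role("new-project", online=False, parent=unclaimed)
existing_project = Role("existing-project", online=False, parent=admins)

for project in (new_project, existing_project):
    path = " -> ".join(r.name for r in chain_of_trust(project))
    print(path, "| exposed:", exposed_to_pypi_compromise(project))
```

Calling `periodic_signing(new_project, admins)` re-parents the new project
under the offline admin role, after which `exposed_to_pypi_compromise`
returns False for it. The author's own key is unchanged throughout, which is
why the move is invisible to the author.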

> It would be useful to have *some* document (or part thereof - maybe an
> overview section in the PEP) structured as an objective cost/benefit
> proposal:
> 1. The current PyPI infrastructure has the following risks. We assess
> the chance that they might occur as X.
> 2. The impact on the reader, as an individual, of a compromise, would
> be the following.
> 3. The cost to the reader, under the proposed solution, of avoiding
> the risks, is the following.
> There are probably at least two classes of reader involved - project
> authors and project users. If in either case one class of user has to
> bear some cost on behalf of the other, then that should be called out.
> I believe that I (and any other readers of the proposals) should be
> able to sensibly assess the cost/benefit tradeoffs on my own, given
> the above information. My experience and judgement may not be typical,
> so my opinion should be taken in the context of others, but that
> doesn't mean I'm wrong, or that my views are inappropriate for me. For
> example, in 20 years of extensively using software off the internet, I
> have never once downloaded a piece of software that wasn't what it
> claimed, and I expected it, to be. So the discussion of compromising
> PyPI packages seems like an amazingly low risk to me[1].

See an upcoming email about refocusing the discussion here.

>> The periodic signing process by the PyPI admins just moves a new project from
>> being signed for by the "Unclaimed" online key to being signed for by our
>> offline keys. This process is basically invisible to everyone involved.
> It's as visible to end users as the significance of describing
> something as "unclaimed". If nobody cared a project was "unclaimed"
> then it would be invisible. Otherwise, less so. Hence my preference
> for a less emotive term.
>> Does that make sense?
> Yes it does - many thanks for the explanation.
> Paul
> [1] That's just an example, and it would be off-topic to debate the
> various other things that overall contribute to why and to what level
> I'm currently comfortable using PyPI. And I'm not running a business
> that would fail if PyPI were compromised. So please just take this as
> a small data point, nothing more.

Donald Stufft
PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
