vetting, signing, verification of release files

I am considering implementing gpg-signing and verification of release files for devpi. Rather than requiring package authors to sign their release files, I am pondering a scheme where anyone can vouch for a particular published release file by publishing a signature for it. This aims to help responsible companies work together. I've heard from devops/admins that they manually download and check release files and then, after some vetting, install them offline. Wouldn't it be useful to turn this into a more collaborative effort? Any thoughts or pointers to existing efforts within the (Python) packaging ecologies? best, holger
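In concrete terms, the vouching could be as small as publishing a detached gpg signature next to the release file. A minimal sketch, assuming GnuPG is installed and a signing key is configured; the filename is only an example:

    import subprocess

    # Vouch for a release file by publishing a detached, ASCII-armored
    # signature for it. Anyone holding the signer's public key can later
    # verify that this exact file is what the signer vetted.
    release = "Django-1.5.1.tar.gz"
    subprocess.check_call([
        "gpg", "--armor", "--detach-sign",
        "--output", release + ".asc", release,
    ])
    # The resulting Django-1.5.1.tar.gz.asc would be uploaded alongside the
    # release file on a devpi index as one vetting signature among many.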

On 16.07.2013, at 11:19, holger krekel <holger@merlinux.eu> wrote:
Any thoughts or pointers to existing efforts within the (Python) packaging ecologies?
Erik Rose just released peep the other day [1], which admittedly doesn't use gpg but at least allows pip users to simplify the manual vetting process.
Jannis
[1] https://pypi.python.org/pypi/peep

Am 16.07.2013 12:21, schrieb Jannis Leidel:
On 16.07.2013, at 11:19, holger krekel <holger@merlinux.eu> wrote:
Any thoughts or pointers to existing efforts within the (Python) packaging ecologies?
Erik Rose just released peep the other day [1], which admittedly doesn't use gpg but at least allows pip users to simplify the manual vetting process.
Peep is a bit scary because the author doesn't have much confidence in his own crypto fu: "Proof of concept. Does all the crypto stuff. Should be secure." Peep doesn't protect you from at least one DoS attack scenario: the tool neither verifies nor limits the size of a downloaded file. In theory an active attacker could make you download an arbitrarily large file in order to clog your network pipes; eventually your machine runs out of disk space, too. I'd feel much better if such a tool verified both the hashsum and the file size. Christian
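A sketch of the check Christian is asking for: verify both the size and the hashsum while downloading, and abort as soon as the stream exceeds the expected size. The URL and expected values are placeholders, and Python 2 idioms are used as was current at the time:

    import hashlib
    import urllib2

    url = "https://pypi.python.org/packages/source/F/FooBar/FooBar-2.0.tar.gz"
    expected_sha256 = "0" * 64  # placeholder digest
    expected_size = 1048576     # placeholder size in bytes

    digest = hashlib.sha256()
    received = 0
    resp = urllib2.urlopen(url)
    while True:
        chunk = resp.read(8192)
        if not chunk:
            break
        received += len(chunk)
        if received > expected_size:
            # Stop early: an attacker cannot clog pipes or fill the disk
            # beyond the vetted size.
            raise IOError("download exceeds the vetted size, aborting")
        digest.update(chunk)
    if received != expected_size or digest.hexdigest() != expected_sha256:
        raise IOError("size or sha256 mismatch, refusing to use the file")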

On Tue, Jul 16, 2013 at 12:21 +0200, Jannis Leidel wrote:
On 16.07.2013, at 11:19, holger krekel <holger@merlinux.eu> wrote:
Any thoughts or pointers to existing efforts within the (Python) packaging ecologies?
Erik Rose just released peep the other day [1], which admittedly doesn't use gpg but at least allows pip users to simplify the manual vetting process.
Jannis
thanks for the pointer, I actually saw that earlier. If I see it correctly it does not target "vetting sharing": if 1000 careful people want to install Django-1.5.1.tar.gz, they each need to do the verification work individually, each creating their own "requirements.txt" with extra hashes. best, holger
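The per-person work being duplicated here is essentially computing and pinning a digest of the archive each admin verified (peep records such hashes in requirements.txt comments; the exact syntax is peep's own). A sketch of just the underlying computation:

    import hashlib

    def file_sha256(path):
        # Hash the verified archive in chunks so large files stay cheap.
        digest = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(8192), b""):
                digest.update(chunk)
        return digest.hexdigest()

    # Each careful installer computes and pins this value individually today;
    # a shared vetting scheme would let them publish and reuse it instead.
    print(file_sha256("Django-1.5.1.tar.gz"))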

On Jul 16, 2013, at 5:19 AM, holger krekel <holger@merlinux.eu> wrote:
I am considering implementing gpg-signing and verification of release files for devpi. Rather than requiring package authors to sign their release files, I am pondering a scheme where anyone can vouch for a particular published release file by publishing a signature for it. This aims to help responsible companies work together. I've heard from devops/admins that they manually download and check release files and then, after some vetting, install them offline. Wouldn't it be useful to turn this into a more collaborative effort?
Any thoughts or pointers to existing efforts within the (Python) packaging ecologies?
best, holger
So I'm not entirely sure what your goals are here. What exactly are you verifying? What is going to verify signatures once you have a (theoretically) trusted set? What is going to keep a malicious actor from poisoning the well? ----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA

On Tue, Jul 16, 2013 at 13:57 -0400, Donald Stufft wrote:
On Jul 16, 2013, at 5:19 AM, holger krekel <holger@merlinux.eu> wrote:
I am considering implementing gpg-signing and verification of release files for devpi. Rather than requiring package authors to sign their release files, I am pondering a scheme where anyone can vouch for a particular published release file by publishing a signature for it. This aims to help responsible companies work together.
So I'm not entirely sure what your goals are here.
The goal is to facilitate collaboration between individuals and companies in vetting the integrity and, to some degree, authenticity of a published pypi package.
What exactly are you verifying? What is going to verify signatures once you have a (theoretically) trusted set? What is going to keep a malicious actor from poisoning the well?
These are typical questions, which is why I asked if anyone knows about existing schemes/efforts. I guess most Linux distros do it already, so if nothing PyPI-specific comes up here (what is the status of TUF, btw?) I am going to look into the distros' working models. One difference is that I want the vetting/signing to happen after publishing, to allow for an incremental approach. cheers, holger

holger krekel <holger <at> merlinux.eu> writes:
about existing schemes/efforts. I guess most Linux distros do it already, so if nothing PyPI-specific comes up here (what is the status of TUF, btw?) I am going to look into the distros' working models.
ISTM it works for distros because they're the central authority guaranteeing the provenance of the software in their repos. It's harder with PyPI because it's not a central authority curating the content. Perhaps something like a web of trust would be needed. Regards, Vinay Sajip

On Wed, Jul 17, 2013 at 07:48 +0000, Vinay Sajip wrote:
holger krekel <holger <at> merlinux.eu> writes:
about existing schemes/efforts. I guess most Linux distros do it already, so if nothing PyPI-specific comes up here (what is the status of TUF, btw?) I am going to look into the distros' working models.
ISTM it works for distros because they're the central authority guaranteeing the provenance of the software in their repos. It's harder with PyPI because it's not a central authority curating the content. Perhaps something like a web of trust would be needed.
I am thinking about curating release files _after_ publishing and then configuring install activities to require "signed-off" release files. Basically giving companies and devops the possibility to organise their vetting processes and collaborate, without requiring PyPI to change first. This certainly involves the question of trust but if nothing else an entity can at least trust its own signatures :) best, holger
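On the install side, "trusting your own signatures" could look like the following sketch: refuse any release file that lacks a valid detached signature from a keyring the entity controls. The keyring name and file paths are hypothetical; gpg does the actual verification:

    import subprocess

    def is_vetted(release_path, sig_path, keyring="vetting-keys.gpg"):
        # Exit status 0 means gpg verified the signature against a key
        # in the designated keyring (and only that keyring).
        rc = subprocess.call([
            "gpg", "--no-default-keyring", "--keyring", keyring,
            "--verify", sig_path, release_path,
        ])
        return rc == 0

    if not is_vetted("Django-1.5.1.tar.gz", "Django-1.5.1.tar.gz.asc"):
        raise SystemExit("no acceptable vetting signature; not installing")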

On 17 Jul 2013 18:17, "holger krekel" <holger@merlinux.eu> wrote:
On Wed, Jul 17, 2013 at 07:48 +0000, Vinay Sajip wrote:
holger krekel <holger <at> merlinux.eu> writes:
about existing schemes/efforts. I guess most Linux distros do it already, so if nothing PyPI-specific comes up here (what is the status of TUF, btw?) I am going to look into the distros' working models.
ISTM it works for distros because they're the central authority guaranteeing the provenance of the software in their repos. It's harder with PyPI because it's not a central authority curating the content. Perhaps something like a web of trust would be needed.
I am thinking about curating release files _after_ publishing and then configuring install activities to require "signed-off" release files. Basically giving companies and devops the possibility to organise their vetting processes and collaborate, without requiring PyPI to change first. This certainly involves the question of trust but if nothing else an entity can at least trust its own signatures :)
Note that Linux distros don't trust each other's keys, nor do app stores trust each other. Secure collaborative vetting of software is an Unsolved Problem. The Update Framework provides a solid technical basis for such collaboration, but even it doesn't solve the fundamental trust issues. Those issues are why we still rely on the CA model for SSL, despite its serious flaws: nobody has come up with anything else that scales effectively. The use of JSON for metadata 2.0 is enough to make it TUF-friendly, but there are significant key management issues to be addressed before TUF could be used on PyPI itself. That's no reason to avoid experimenting with private TUF-enabled PyPI servers, though - a private server alleviates most of the ugly key management problems. Cheers, Nick.
best, holger

On 07/17/2013 04:50 PM, Nick Coghlan wrote:
On 17 Jul 2013 18:17, "holger krekel" <holger@merlinux.eu> wrote:
On Wed, Jul 17, 2013 at 07:48 +0000, Vinay Sajip wrote:
holger krekel <holger <at> merlinux.eu> writes:
about existing schemes/efforts. I guess most Linux distros do it already, so if nothing PyPI-specific comes up here (what is the status of TUF, btw?) I am going to look into the distros' working models.
ISTM it works for distros because they're the central authority guaranteeing the provenance of the software in their repos. It's harder with PyPI because it's not a central authority curating the content. Perhaps something like a web of trust would be needed.
I am thinking about curating release files _after_ publishing and then configuring install activities to require "signed-off" release files. Basically giving companies and devops the possibility to organise their vetting processes and collaborate, without requiring PyPI to change first. This certainly involves the question of trust but if nothing else an entity can at least trust its own signatures :)
Note that Linux distros don't trust each other's keys, nor do app stores trust each other. Secure collaborative vetting of software is an Unsolved Problem. The Update Framework provides a solid technical basis for such collaboration, but even it doesn't solve the fundamental trust issues. Those issues are why we still rely on the CA model for SSL, despite its serious flaws: nobody has come up with anything else that scales effectively.
The use of JSON for metadata 2.0 is enough to make it TUF-friendly, but there are significant key management issues to be addressed before TUF could be used on PyPI itself. That's no reason to avoid experimenting with private TUF-enabled PyPI servers, though - a private server alleviates most of the ugly key management problems.
Thank you, Nick. Indeed, we think that TUF (designed in collaboration with some of the chief designers of the Tor project) offers a secure yet usable way to address many classes of attacks on package managers, many previously left unconsidered in the Linux distribution community until we pointed them out, at which point they adopted our security proposals (https://isis.poly.edu/~jcappos/papers/cappos_mirror_ccs_08.pdf). We are delighted to see that JSON is being used for PyPI metadata 2.0, which would certainly lend itself very easily to integration with TUF.
Speaking of which, let me answer some questions about the current status of PyPI and pip over TUF. TL;DR: we now have a pretty good scheme balancing key management with security for PyPI. At the time of writing, I have an almost identical version of pip that is ready to read metadata off a TUF-secured PyPI mirror. There is just one thing left to do: I need to compress the metadata as much as possible (a problem common to all package managers). I expect this to be done in the next two weeks, by which time we should have a slightly modified version of pip which securely downloads packages from an up-to-date TUF-secured PyPI mirror.
(Aside: let me say that we are discussing all things related to PyPI, pip and TUF on the TUF mailing list (https://groups.google.com/forum/?fromgroups#!forum/theupdateframework). I welcome you to join our mailing list so that we can continue the discussion. I did not want to incessantly copy our discussions to the Distutils mailing list because I was not sure whether they would always be relevant to the Distutils-SIG, which is already busy with a number of other projects. In retrospect, perhaps I should have summarized our findings every now and then on this list, because I can understand that it looks to some people as though we have been silent, when in fact that was not the case.)
To very briefly summarize our status without going into tangential details:
1. We previously found and reported on this mailing list that if we naively assigned a key to every PyPI project, then the metadata would not scale. We would have security with little usability. This looks like an insoluble key management problem, but we think we have a pretty good solution.
2. The solution is briefly this: we now propose just two targets roles for all PyPI files.
2.1. The first role --- called the "unstable" targets role --- will have completely online keys (meaning that they can be kept on the server for automated release purposes). The unstable role will sign for all PyPI files being added, updated or deleted, without question. The metadata for this role will change all the time.
2.2. The second role --- called the "stable" targets role --- will have completely offline keys (meaning that keys are kept as securely as possible and used only with manual human intervention). The stable role will sign for only the PyPI files which have been vetted and deemed trustworthy. The metadata for this role is expected to change much less frequently than that of the unstable role.
Okay, sounds too abstract to some. What does this mean in practice? We want to make key management simple. Preferably, as Nick Coghlan and others have proposed before, we would want PyPI to initially, at least, sign for all packages, because managing keys for every single project right off the bat is potentially painful.
Therefore, with that view in mind --- which is to first accommodate PyPI signing for packages, and gradually allow projects to sign for their own packages --- consider what our proposal above would do. Firstly, it would make key management so much simpler. There is a sufficient number of offline keys used to sign metadata for a valuable and trustworthy set of packages (done only every now and then), and an online key used to make continuous release of PyPI packages possible (done all the time).
1. Now suppose that the top-level targets role says: when you download a package, you must first always ask the stable role about it. If it has something to say about it, then use that information (and just ignore the unstable role). Otherwise, ask the unstable role about it.
2. Fine, what about that? Now suppose that both the stable and unstable roles have signed for some very popular package called FooBar 2.0. Suppose further that attackers have broken into the TUF-secured PyPI repository. Oh, they can't find the keys to the stable role, so they can't mess with the stable role metadata without getting caught, but since the unstable keys are online, they could make that role sign for malicious versions of the FooBar 2.0 package.
3. But no problem there! Since we have instructed that the stable role must always be consulted first, valid metadata about the intended, trusted FooBar 2.0 package cannot be modified (not without getting all the human owners of the keys to collude). The unstable role may be tampered with to offer bogus metadata, but the security impact will be limited by *prior* metadata about packages in the way-harder-to-attack stable role.
More details, should you be interested, are available here: https://groups.google.com/forum/?fromgroups#!topic/theupdateframework/pocW9b...
I hope that answers a number of questions. Let us know if you have more, and I think I can safely conclude that I can start discussing TUF on this mailing list again!
PS: Pardon any delay in my response in the next couple of days, as I will be flying to New York in approximately 24 hours and will be traveling for a day or so.
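The stable-first lookup order described above can be sketched in a few lines, with plain dicts standing in for signed TUF metadata (this is not TUF's actual API):

    # Each mapping stands for a role's signed targets metadata:
    # filename -> signed (hash, length) info.
    def lookup_target(filename, stable_targets, unstable_targets):
        if filename in stable_targets:
            return stable_targets[filename]    # offline-key-signed, wins
        return unstable_targets.get(filename)  # online-key-signed fallback

    stable = {"FooBar-2.0.tar.gz": {"sha256": "aaaa", "length": 12345}}
    unstable = {"FooBar-2.0.tar.gz": {"sha256": "evil", "length": 99999},
                "FooBar-2.0.1.tar.gz": {"sha256": "bbbb", "length": 23456}}

    # An attacker holding only the online key can rewrite the unstable entry
    # for FooBar-2.0, but clients never consult it: the stable entry wins.
    print(lookup_target("FooBar-2.0.tar.gz", stable, unstable))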

On 17 Jul, 2013, at 19:17, Trishank Karthik Kuppusamy <tk47@students.poly.edu> wrote:
To very briefly summarize our status without going into tangential details:
1. We previously found and reported on this mailing list that if we naively assigned a key to every PyPI project, then the metadata would not scale. We would have security with little usability. This looks like an insoluble key management problem, but we think we have a pretty good solution.
2. The solution is briefly this: we now propose just two targets roles for all PyPI files.
2.1. The first role --- called the "unstable" targets role --- will have completely online keys (meaning that they can be kept on the server for automated release purposes). The unstable role will sign for all PyPI files being added, updated or deleted, without question. The metadata for this role will change all the time.
2.2. The second role --- called the "stable" targets role --- will have completely offline keys (meaning that keys are kept as securely as possible and used only with manual human intervention). The stable role will sign for only the PyPI files which have been vetted and deemed trustworthy. The metadata for this role is expected to change much less frequently than that of the unstable role.
Okay, sounds too abstract to some. What does this mean in practice? We want to make key management simple. Preferably, as Nick Coghlan and others have proposed before, we would want PyPI to initially, at least, sign for all packages, because managing keys for every single project right off the bat is potentially painful. Therefore, with that view in mind --- which is to first accommodate PyPI signing for packages, and gradually allowing projects to sign for their own packages --- we then consider what our proposal above would do.
Firstly, it would make key management so much simpler. There is a sufficient number of offline keys used to sign metadata for a valuable and trustworthy set of packages (done only every now and then), and an online key used to make continuous release of PyPI packages possible (done all the time).
1. Now suppose that the top-level targets role says: when you download a package, you must first always ask the stable role about it. If it has something to say about it, then use that information (and just ignore the unstable role). Otherwise, ask the unstable role about it.
2. Fine, what about that? Now suppose that both the stable and unstable roles have signed for some very popular package called FooBar 2.0. Suppose further that attackers have broken into the TUF-secured PyPI repository. Oh, they can't find the keys to the stable role, so they can't mess with the stable role metadata without getting caught, but since the unstable keys are online, they could make that role sign for malicious versions of the FooBar 2.0 package.
3. But no problem there! Since we have instructed that the stable role must always be consulted first, valid metadata about the intended, trusted FooBar 2.0 package cannot be modified (not without getting all the human owners of the keys to collude). The unstable role may be tampered with to offer bogus metadata, but the security impact will be limited by *prior* metadata about packages in the way-harder-to-attack stable role.
I'm trying to understand what this means for package maintainers. If I understand you correctly, maintainers would upload packages just like they do now, and packages are then automatically signed by the "unstable" role. Then some manual process by the PyPI maintainers can sign a package with the stable role. Is that correct? If it is, how is this supposed to scale? The contents of PyPI is currently not vetted at all, and it seems to me that manually vetting uploads for even the most popular packages would be a significant amount of work that would have to be done by what's likely a small set of volunteers.
Also, what are you supposed to do when FooBar 2.0 is signed by the stable role and FooBar 2.0.1 is only signed by the unstable role, and you try to fetch FooBar 2.0.* (that is, 2.0 or any 2.0.x point release)?
Ronald

Essentially, nothing changes from the user's standpoint or from the standpoint of the package developer (except they sign their package). The reason why we have multiple roles is to be robust against attacks in case the main PyPI repo is hacked. (Trishank can chime in with more complete / precise information once he's back.) Thanks, Justin On Wed, Jul 17, 2013 at 3:24 PM, Ronald Oussoren <ronaldoussoren@mac.com>wrote:
On 17 Jul, 2013, at 19:17, Trishank Karthik Kuppusamy < tk47@students.poly.edu> wrote:
To very briefly summarize our status without going into tangential
details:
1. We previously found and reported on this mailing list that if we naively assigned a key to every PyPI project, then the metadata would not scale. We would have security with little usability. This looks like an insoluble key management problem, but we think we have a pretty good solution.
2. The solution is briefly this: we now propose just two targets roles for all PyPI files.
2.1. The first role --- called the "unstable" targets role --- will have completely online keys (meaning that they can be kept on the server for automated release purposes). The unstable role will sign for all PyPI files being added, updated or deleted, without question. The metadata for this role will change all the time.
2.2. The second role --- called the "stable" targets role --- will have completely offline keys (meaning that keys are kept as securely as possible and used only with manual human intervention). The stable role will sign for only the PyPI files which have been vetted and deemed trustworthy. The metadata for this role is expected to change much less frequently than that of the unstable role.
Okay, sounds too abstract to some. What does this mean in practice? We
want to make key management simple. Preferably, as Nick Coghlan and others have proposed before, we would want PyPI to initially, at least, sign for all packages, because managing keys for every single project right off the bat is potentially painful. Therefore, with that view in mind --- which is to first accommodate PyPI signing for packages, and gradually allowing projects to sign for their own packages --- we then consider what our proposal above would do.
Firstly, it would make key management so much simpler. There is a
sufficient number of offline keys used to sign metadata for a valuable and trustworthy set of packages (done only every now and then), and an online key used to make continuous release of PyPI packages possible (done all the time).
1. Now suppose that the top-level targets role says: when you download a package, you must first always ask the stable role about it. If it has something to say about it, then use that information (and just ignore the unstable role). Otherwise, ask the unstable role about it.
2. Fine, what about that? Now suppose that both the stable and unstable roles have signed for some very popular package called FooBar 2.0. Suppose further that attackers have broken into the TUF-secured PyPI repository. Oh, they can't find the keys to the stable role, so they can't mess with the stable role metadata without getting caught, but since the unstable keys are online, they could make that role sign for malicious versions of the FooBar 2.0 package.
3. But no problem there! Since we have instructed that the stable role must always be consulted first, valid metadata about the intended, trusted FooBar 2.0 package cannot be modified (not without getting all the human owners of the keys to collude). The unstable role may be tampered with to offer bogus metadata, but the security impact will be limited by *prior* metadata about packages in the way-harder-to-attack stable role.
I'm trying to understand what this means for package maintainers. If I understand you correctly, maintainers would upload packages just like they do now, and packages are then automatically signed by the "unstable" role. Then some manual process by the PyPI maintainers can sign a package with the stable role. Is that correct? If it is, how is this supposed to scale? The contents of PyPI is currently not vetted at all, and it seems to me that manually vetting uploads for even the most popular packages would be a significant amount of work that would have to be done by what's likely a small set of volunteers.
Also, what are you supposed to do when FooBar 2.0 is signed by the stable role and FooBar 2.0.1 is only signed by the unstable role, and you try to fetch FooBar 2.0.* (that is, 2.0 or any 2.0.x point release)?
Ronald

On 07/18/2013 03:24 AM, Ronald Oussoren wrote:
I'm trying to understand what this means for package maintainers. If I understand you correctly, maintainers would upload packages just like they do now, and packages are then automatically signed by the "unstable" role. Then some manual process by the PyPI maintainers can sign a package with the stable role. Is that correct? If it is, how is this supposed to scale? The contents of PyPI is currently not vetted at all, and it seems to me that manually vetting uploads for even the most popular packages would be a significant amount of work that would have to be done by what's likely a small set of volunteers.
I think Daniel put it best when he said that we have been focusing too much on deciding whether or not a package is malicious. As he said, it is important that any security proposal must limit what targeted attacks on the PyPI infrastructure can do.
You are right that asking people to vet packages for inclusion into the stable role would be generally unscalable. I think the best way to think about it is that we can mostly decide a "stable" set of packages with a simple rule, and then *choose* to interfere (if necessary) with decisions on which packages go in or out of the stable role. The stable role simply has to sign this automatically computed set of "stable" packages every now and then, so that the impacts of attacks on the PyPI infrastructure are limited. Users who install the same set of stable packages will see the installation of the same set of intended packages.
Presently, I use a simple heuristic to compute a nominal set of stable packages: all files older than 3 months are considered to be "stable". There is no consideration of whether a package is malicious here; just that it has not changed for long enough to be considered mature.
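A sketch of that heuristic with hypothetical input (upload times as seconds since the epoch):

    import time

    THREE_MONTHS = 90 * 24 * 3600  # roughly three months, in seconds

    def nominate_stable(files, now=None):
        # files maps filename -> upload timestamp; anything old enough is
        # nominated into the "stable" set, with no judgment about whether
        # the contents are malicious.
        now = time.time() if now is None else now
        return sorted(name for name, uploaded in files.items()
                      if now - uploaded > THREE_MONTHS)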
Also, what are you supposed to do when FooBar 2.0 is signed by the stable role and FooBar 2.0.1 is only signed by the unstable role, and you try to fetch FooBar 2.0.* (that is, 2.0 or any 2.0.x point release)?
In this case, I expect that since we have asked pip to install FooBar 2.0.*, it will first fetch the /simple/FooBar/ PyPI metadata (distinct from TUF metadata) to see what versions of the FooBar package are available. If FooBar 2.0.1 was recently added, then the latest version of the /simple/FooBar/ metadata would have been signed by the unstable role. There are two cases for the stable role:
1. The stable role has also signed for the FooBar 2.0.1 package. In this case, pip would find FooBar 2.0.1 and install it.
2. The stable role has not yet signed for the FooBar 2.0.1 package. In this case, pip would find FooBar 2.0 and install it.
Why would this happen? We have specified in the TUF metadata that if the same file (in this case, the /simple/FooBar/ HTML file) has been signed for by both the stable and unstable roles, then the client must prefer the version from the stable role.
Of course, there are questions about timeliness. Sometimes users want the latest packages, or the developers of the packages themselves may want this to be the case. For the purposes of bootstrapping PyPI with TUF, we have presently decided to simplify key management and allow for the protection of some valuable packages on PyPI (with a limited timeliness trade-off) while allowing the majority of the packages to be continuously released.
There are a few ways to ensure that the latest intended versions of the FooBar package will be installed:
1. Do not nominate FooBar into the "stable" set of packages, which should ideally be reserved --- for initial bootstrapping purposes at least --- for perhaps what the community thinks are the "canonical" packages that must initially be protected from attacks.
2. The stable role may delegate its responsibility for information about the FooBar package to the FooBar package developers themselves.
3. Explore different rules (other than just ordering roles by trust) to balance key management, timeliness and other issues without significantly sacrificing security.
We welcome your thoughts here. For the moment, we are planning to wrap up as soon as possible our experiments on how PyPI+pip perform with and without TUF with this particular scheme of stable and unstable roles.
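The two cases above come down to which role's copy of the /simple/FooBar/ index page the client trusts. A sketch with placeholder version lists (not real TUF metadata):

    def visible_versions(stable_copy, unstable_copy):
        # When both roles have signed a copy of the same index page,
        # the client must use the stable role's copy.
        return stable_copy if stable_copy is not None else unstable_copy

    # Case 1: the stable role has re-signed after 2.0.1 was added.
    print(visible_versions(["2.0", "2.0.1"], ["2.0", "2.0.1"]))  # pip picks 2.0.1
    # Case 2: the stable role has not yet signed for 2.0.1.
    print(visible_versions(["2.0"], ["2.0", "2.0.1"]))           # pip picks 2.0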

My impression is this only holds for things signed directly by PyPI because the developers have not registered a key. I think that developers who register keys won't have this issue. Let's talk about this when you return, but it's really projects / developers that will be stable in the common case, not packages, right? Justin On Wed, Jul 17, 2013 at 9:29 PM, Trishank Karthik Kuppusamy < tk47@students.poly.edu> wrote:
On 07/18/2013 03:24 AM, Ronald Oussoren wrote:
I'm trying to understand what this means for package maintainers. If I understand you correctly, maintainers would upload packages just like they do now, and packages are then automatically signed by the "unstable" role. Then some manual process by the PyPI maintainers can sign a package with the stable role. Is that correct? If it is, how is this supposed to scale? The contents of PyPI is currently not vetted at all, and it seems to me that manually vetting uploads for even the most popular packages would be a significant amount of work that would have to be done by what's likely a small set of volunteers.
I think Daniel put it best when he said that we have been focusing too much on deciding whether or not a package is malicious. As he said, it is important that any security proposal must limit what targeted attacks on the PyPI infrastructure can do.
You are right that asking people to vet packages for inclusion into the stable role would be generally unscalable. I think the best way to think about it is that we can mostly decide a "stable" set of packages with a simple rule, and then *choose* to interfere (if necessary) with decisions on which packages go in or out of the stable role. The stable role simply has to sign this automatically computed set of "stable" packages every now and then, so that the impacts of attacks on the PyPI infrastructure are limited. Users who install the same set of stable packages will see the installation of the same set of intended packages.
Presently, I use a simple heuristic to compute a nominal set of stable packages: all files older than 3 months are considered to be "stable". There is no consideration of whether a package is malicious here; just that it has not changed for long enough to be considered mature.
Also, what are you supposed to do when FooBar 2.0 is signed by the stable
role and FooBar 2.0.1 is only signed by the unstable role, and you try to fetch FooBar 2.0.* (that is, 2.0 or any 2.0.x point release)?
In this case, I expect that since we have asked pip to install FooBar 2.0.*, it will first fetch the /simple/FooBar/ PyPI metadata (distinct from TUF metadata) to see what versions of the FooBar package are available. If FooBar 2.0.1 was recently added, then the latest version of the /simple/FooBar/ metadata would have been signed by the unstable role. There are two cases for the stable role:
1. The stable role has also signed for the FooBar 2.0.1 package. In this case, pip would find FooBar 2.0.1 and install it. 2. The stable role has not yet signed for the FooBar 2.0.1 package. In this case, pip would find FooBar 2.0 and install it.
Why would this happen? In this case, we have specified in the TUF metadata that if the same file (in this case, the /simple/FooBar/ HTML file) has been signed for by both the stable and unstable roles, then the client must prefer the version from the stable role.
Of course, there are questions about timeliness. Sometimes users want the latest packages, or the developers of the packages themselves may want this to be the case. For the purposes of bootstrapping PyPI with TUF, we have presently decided to simplify key management and allow for the protection of some valuable packages on PyPI (with limited timeliness trade-off) while allowing for the majority of the packages to be continuously released.
There are a few ways to ensure that the latest intended versions of the FooBar package will be installed:
1. Do not nominate FooBar into the "stable" set of packages, which should ideally be reserved --- for initial bootstrapping purposes at least --- for perhaps what the community thinks are the "canonical" packages that must initially be protected from attacks.
2. The stable role may delegate its responsibility for information about the FooBar package to the FooBar package developers themselves.
3. Explore different rules (other than just ordering roles by trust) to balance key management, timeliness and other issues without significantly sacrificing security.
We welcome your thoughts here. For the moment, we are planning to wrap up as soon as possible our experiments on how PyPI+pip perform with and without TUF with this particular scheme of stable and unstable roles.

On 07/18/2013 09:34 AM, Justin Cappos wrote:
My impression is this only holds for things signed directly by PyPI because the developers have not registered a key. I think that developers who register keys won't have this issue. Let's talk about this when you return, but it's really projects / developers that will be stable in the common case, not packages, right?
Yes, developers who register keys and have the stable role delegate their packages to themselves will not have this issue. When I say "package", I mean what gets downloaded and installed when pip goes to PyPI to get a package with exactly the given name. I am not aware of a way to guide pip to install packages by projects (could you clarify what you mean by this?) or developers, but perhaps this might change in the future with PyPI metadata 2.0.

On Jul 17, 2013, at 9:29 PM, Trishank Karthik Kuppusamy <tk47@students.poly.edu> wrote:
On 07/18/2013 03:24 AM, Ronald Oussoren wrote:
I'm trying to understand what this means for package maintainers. If I understand you correctly, maintainers would upload packages just like they do now, and packages are then automatically signed by the "unstable" role. Then some manual process by the PyPI maintainers can sign a package with the stable role. Is that correct? If it is, how is this supposed to scale? The contents of PyPI is currently not vetted at all, and it seems to me that manually vetting uploads for even the most popular packages would be a significant amount of work that would have to be done by what's likely a small set of volunteers.
I think Daniel put it best when he said that we have been focusing too much on deciding whether or not a package is malicious. As he said, it is important that any security proposal must limit what targeted attacks on the PyPI infrastructure can do.
As I've mentioned before, an online key (as is required by PyPI) means that if someone compromises PyPI they compromise the key. It seems to me that TUF is really designed to handle the case of a Linux distribution (or similar) where you have vetted maintainers who are each given a subsection of the total releases. However, PyPI has neither vetted authors nor the manpower to sign authors' keys offline. PyPI and a Linux distro repo solve problems that appear similar but are actually quite different under the surface. I do agree, however, that PyPI should not attempt to discern what is malicious or not.
You are right that asking people to vet packages for inclusion into the stable role would be generally unscalable. I think the best way to think about it is that we can mostly decide a "stable" set of packages with a simple rule, and then *choose* to interfere (if necessary) with decisions on which packages go in or out of the stable role. The stable role simply has to sign this automatically computed set of "stable" packages every now and then, so that the impacts of attacks on the PyPI infrastructure are limited. Users who install the same set of stable packages will see the installation of the same set of intended packages.
Presently, I use a simple heuristic to compute a nominal set of stable packages: all files older than 3 months are considered to be "stable". There is no consideration of whether a package is malicious here; just that it has not changed for long enough to be considered mature.
Also, what are you supposed to do when FooBar 2.0 is signed by the stable role and FooBar 2.0.1 is only signed by the unstable role, and you try to fetch FooBar 2.0.* (that is, 2.0 or any 2.0.x point release)?
In this case, I expect that since we have asked pip to install FooBar 2.0.*, it will first fetch the /simple/FooBar/ PyPI metadata (distinct from TUF metadata) to see what versions of the FooBar package are available. If FooBar 2.0.1 was recently added, then the latest version of the /simple/FooBar/ metadata would have been signed by the unstable role. There are two cases for the stable role:
1. The stable role has also signed for the FooBar 2.0.1 package. In this case, pip would find FooBar 2.0.1 and install it. 2. The stable role has not yet signed for the FooBar 2.0.1 package. In this case, pip would find FooBar 2.0 and install it.
And things are stable after 3 months? This sounds completely insane. So if a package releases a security update it'll be 3 months until people get that fix by default?
Why would this happen? In this case, we have specified in the TUF metadata that if the same file (in this case, the /simple/FooBar/ HTML file) has been signed for by both the stable and unstable roles, then the client must prefer the version from the stable role.
Of course, there are questions about timeliness. Sometimes users want the latest packages, or the developers of the packages themselves may want this to be the case. For the purposes of bootstrapping PyPI with TUF, we have presently decided to simplify key management and allow for the protection of some valuable packages on PyPI (with limited timeliness trade-off) while allowing for the majority of the packages to be continuously released.
There are a few ways to ensure that the latest intended versions of the FooBar package will be installed:
1. Do not nominate FooBar into the "stable" set of packages, which should ideally be reserved --- for initial bootstrapping purposes at least --- for perhaps what the community thinks are the "canonical" packages that must initially be protected from attacks.
2. The stable role may delegate its responsibility for information about the FooBar package to the FooBar package developers themselves.
3. Explore different rules (other than just ordering roles by trust) to balance key management, timeliness and other issues without significantly sacrificing security.
We welcome your thoughts here. For the moment, we are planning to wrap up as soon as possible our experiments on how PyPI+pip perform with and without TUF with this particular scheme of stable and unstable roles.
----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA

If there is not a compromise of PyPI, then all updates happen essentially instantly. Developers who do not sign their packages (so that PyPI signs them instead) may have their newest packages remain unavailable for a period of up to 3 months *if there is a compromise of PyPI*.
Thanks, Justin
On Wed, Jul 16, 2013 at 9:46 PM, Donald Stufft <donald@stufft.io> wrote:
On Jul 17, 2013, at 9:29 PM, Trishank Karthik Kuppusamy < tk47@students.poly.edu> wrote:
On 07/18/2013 03:24 AM, Ronald Oussoren wrote:
I'm trying to understand what this means for package maintainers. If I understand you correctly, maintainers would upload packages just like they do now, and packages are then automatically signed by the "unstable" role. Then some manual process by the PyPI maintainers can sign a package with the stable role. Is that correct? If it is, how is this supposed to scale? The contents of PyPI is currently not vetted at all, and it seems to me that manually vetting uploads for even the most popular packages would be a significant amount of work that would have to be done by what's likely a small set of volunteers.
I think Daniel put it best when he said that we have been focusing too much on deciding whether or not a package is malicious. As he said, it is important that any security proposal must limit what targeted attacks on the PyPI infrastructure can do.
As I've mentioned before, an online key (as is required by PyPI) means that if someone compromises PyPI they compromise the key. It seems to me that TUF is really designed to handle the case of a Linux distribution (or similar) where you have vetted maintainers who are each given a subsection of the total releases. However, PyPI has neither vetted authors nor the manpower to sign authors' keys offline.
PyPI and a Linux distro repo solve problems that appear similar but are actually quite different under the surface.
I do agree, however, that PyPI should not attempt to discern what is malicious or not.
You are right that asking people to vet packages for inclusion
into the stable role would be generally unscalable. I think the best way to think about it is that we can mostly decide a "stable" set of packages with a simple rule, and then *choose* to interfere (if necessary) with decisions on which packages go in or out of the stable role. The stable role simply has to sign this automatically computed set of "stable" packages every now and then, so that the impacts of attacks on the PyPI infrastructure are limited. Users who install the same set of stable packages will see the installation of the same set of intended packages.
Presently, I use a simple heuristic to compute a nominal set of stable
packages: all files older than 3 months are considered to be "stable". There is no consideration of whether a package is malicious here; just that it has not changed for long enough to be considered mature.
Also, what are you supposed to do when FooBar 2.0 is signed by the stable role and FooBar 2.0.1 is only signed by the unstable role, and you try to fetch FooBar 2.0.* (that is, 2.0 or any 2.0.x point release)?
In this case, I expect that since we have asked pip to install FooBar 2.0.*, it will first fetch the /simple/FooBar/ PyPI metadata (distinct from TUF metadata) to see what versions of the FooBar package are available. If FooBar 2.0.1 was recently added, then the latest version of the /simple/FooBar/ metadata would have been signed by the unstable role. There are two cases for the stable role:
1. The stable role has also signed for the FooBar 2.0.1 package. In this case, pip would find FooBar 2.0.1 and install it.
2. The stable role has not yet signed for the FooBar 2.0.1 package. In this case, pip would find FooBar 2.0 and install it.
And things are stable after 3 months? This sounds completely insane. So if a package releases a security update it'll be 3 months until people get that fix by default?
Why would this happen? In this case, we have specified in the TUF
metadata that if the same file (in this case, the /simple/FooBar/ HTML file) has been signed for by both the stable and unstable roles, then the client must prefer the version from the stable role.
Of course, there are questions about timeliness. Sometimes users want
the latest packages, or the developers of the packages themselves may want this to be the case. For the purposes of bootstrapping PyPI with TUF, we have presently decided to simplify key management and allow for the protection of some valuable packages on PyPI (with limited timeliness trade-off) while allowing for the majority of the packages to be continuously released.
There are a few ways to ensure that the latest intended versions of the FooBar package will be installed:
1. Do not nominate FooBar into the "stable" set of packages, which should ideally be reserved --- for initial bootstrapping purposes at least --- for perhaps what the community thinks are the "canonical" packages that must initially be protected from attacks.
2. The stable role may delegate its responsibility for information about the FooBar package to the FooBar package developers themselves.
3. Explore different rules (other than just ordering roles by trust) to balance key management, timeliness and other issues without significantly sacrificing security.
We welcome your thoughts here. For the moment, we are planning to wrap up as soon as possible our experiments on how PyPI+pip perform with and without TUF with this particular scheme of stable and unstable roles.
----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA

On Jul 17, 2013, at 9:52 PM, Justin Cappos <jcappos@poly.edu> wrote:
If there is not a compromise of PyPI, then all updates happen essentially instantly.
Developers who do not sign their packages (so that PyPI signs them instead) may have their newest packages remain unavailable for a period of up to 3 months *if there is a compromise of PyPI*.
Can you go into details about how things will graduate from unstable to stable instantly in a way that a compromise of PyPI doesn't also allow that?
Thanks, Justin

Sure. The "stable" key is kept offline (not on PyPI). It knows who the developers for projects are and delegates trust to them. So Django (for example) has its key signed by this offline key.
The "bleeding-edge" key is kept online on PyPI. It is used to sign project keys for projects newer than the last use of the stable key. If I register a new project "mycoolnewpypiproject" and choose to sign my packages, then it delegates trust to me. Importantly, if the stable and bleeding-edge roles trust the same project name with different keys, the stable role's key is used (a sketch of this precedence rule follows the quoted text below).
A malicious attacker that can hack PyPI can get access to the bleeding-edge key and also some other items that say how timely the data is and similar things. They could say that "mycoolnewpypiproject" is actually signed by a different key than mine because they possess the bleeding-edge role. However, they can't (convincingly) say that Django is signed by a different key because the stable key already has this role listed.
Sorry for any confusion about this. We will provide a bunch of other information soon (should we do this as a PEP?) along with example metadata and working code. We definitely appreciate any feedback.
Thanks, Justin
On Wed, Jul 17, 2013 at 9:54 PM, Donald Stufft <donald@stufft.io> wrote:
On Jul 17, 2013, at 9:52 PM, Justin Cappos <jcappos@poly.edu> wrote:
If there is not a compromise of PyPI, then all updates happen essentially instantly.
Developers who do not sign their packages (so that PyPI signs them instead) may have their newest packages remain unavailable for a period of up to 3 months *if there is a compromise of PyPI*.
Can you go into details about how things will graduate from unstable to stable instantly in a way that a compromise of PyPI doesn't also allow that?
Thanks, Justin
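A sketch of the precedence rule Justin describes (hypothetical structures, not TUF's real metadata format): a project key registered in the offline-signed stable role cannot be overridden by the online bleeding-edge role.

    def project_key(project, stable_delegations, bleeding_edge_delegations):
        # The stable role's delegation, signed offline, always wins.
        if project in stable_delegations:
            return stable_delegations[project]
        return bleeding_edge_delegations.get(project)

    stable_delegations = {"Django": "key-django"}
    bleeding_edge_delegations = {
        "Django": "key-attacker",              # ignored: stable wins
        "mycoolnewpypiproject": "key-justin",  # new project, online role
    }

    assert project_key("Django", stable_delegations,
                       bleeding_edge_delegations) == "key-django"
    assert project_key("mycoolnewpypiproject", stable_delegations,
                       bleeding_edge_delegations) == "key-justin"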

On 18 July 2013 12:06, Justin Cappos <jcappos@poly.edu> wrote:
Sorry for any confusion about this. We will provide a bunch of other information soon (should we do this as a PEP?) along with example metadata and working code. We definitely appreciate any feedback.
It's probably too early for a PEP (since we already have way too many other things in motion for people to sensibly keep track of), but this certainly sounds promising - a post summarising your efforts to date would be really helpful. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

Okay, we'll get this together once Trishank returns and we've had a chance to write up the latest. Justin On Wed, Jul 17, 2013 at 11:52 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 18 July 2013 12:06, Justin Cappos <jcappos@poly.edu> wrote:
Sorry for any confusion about this. We will provide a bunch of other information soon (should we do this as a PEP?) along with example metadata and working code. We definitely appreciate any feedback.
It's probably too early for a PEP (since we already have way too many other things in motion for people to sensibly keep track of), but this certainly sounds promising - a post summarising your efforts to date would be really helpful.
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Wed, Jul 17, 2013 at 21:46 -0400, Donald Stufft wrote:
As I've mentioned before, an online key (as is required by PyPI) means that if someone compromises PyPI they compromise the key. It seems to me that TUF is really designed to handle the case of a Linux distribution (or similar) where you have vetted maintainers who are each given a subsection of the total releases. However, PyPI has neither vetted authors nor the manpower to sign authors' keys offline.
If we had a person with a master key present at PyCon conferences, package maintainers could walk up and have their keys signed. Given the many activities of the PSF and the community, I don't think it's off-limits. If we had sig-verified installs, there would be an incentive for authors to make that little effort. best, holger

In my opinion it is a good idea to embed not just the *name* of the package that your package depends on, but also the public key or keys that your package requires the depended-upon package to be signed by.
There was a time when wheel did this, using Ed25519 keys (which are nice and small, so it is easy to embed them directly into the metadata next to things like URLs and author names). I don't know if it still does. There's a PEP that mentions JWS signatures: http://www.python.org/dev/peps/pep-0427/
Regards, Zooko
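A hypothetical shape for such metadata (illustrative only; the field names are invented and this is not the PEP 427 format): the dependency entry names the package and the key(s) an acceptable copy must be signed by. Ed25519 public keys are 32 bytes, so they fit comfortably next to names and URLs:

    # Illustrative dependency metadata; "signed_by" is an invented field.
    requires = [
        {
            "name": "requests",
            # urlsafe-base64 of a 32-byte Ed25519 public key (placeholder)
            "signed_by": ["ed25519:<urlsafe-base64-public-key>"],
        },
    ]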

On Jul 17, 2013, at 3:58 PM, zooko <zooko@zooko.com> wrote:
In my opinion it is a good idea to embed, not just the *name* of the package that your package depends on, but also the public key or public keys that your package requires the depended-upon package to be signed by.
The problem with this is it makes it more difficult to replace a library with a patched copy.
Example: I want to install the library Foo, Foo depends on Bar, and Bar depends on Broken. Broken is, well, broken, and I want to use a patched version of it locally. So I fix Broken, upload it to my private index server, and I pip install from that. If public keys are encoded as part of the dependency chain, not only do I need to patch Broken but I also need to patch Foo and Bar _and_ anything else that depends on Foo, Bar, or Broken _and_ anything else that depends on those, and so on until we reach the leaves.
Packages should have signatures. Dependency should be by name. End tooling should provide a method to make a set of requirements with certain signatures or hashes for a specific instance of this installation. (E.g. Awesome, Inc. could have a set of requirements that contain Foo, Bar and their own patched version of Broken, along with the keys used to sign all of them.)
----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA

Wheel provides a "wheel keygen" and a "wheel sign" command, and if you set WHEEL_TOOL=/path/to/wheel then bdist_wheel will automatically sign all the packages you create. Ideally wheel would sign every package, reducing the problem from "how do we force people to use PGP" to "how do we derive value from existing signatures". It also allows multiple signers per package.
Readers of this list mostly use pypi for library management during development. Zooko's use case is different and appropriate for an application publisher. You trust the application publisher and want to get the versions of dependencies they trust/tested, or that are signed by people they trust. As an end user you do not want parties unknown to "fix" dependencies.
In any case it wasn't ever expected that people would embed keys in setup.py's abstract dependencies; rather, they would go into the requirements.txt used to install complete applications. You would also have had the option to trust any number of signing keys (n signers out of m possible signers; likely, at a minimum, both the publisher's and your own signing key would be accepted for any particular package; see the sketch after the quoted text below).
There has been a focus on deciding whether a package is malicious. I think that's wrong / too hard. It's better to focus on making sure everyone at least gets the same packages, so targeted attacks via the pypi system don't work. I also feel it's much more important to make signatures widespread than to make them individually as secure as possible.
On Wed, Jul 17, 2013 at 4:14 PM, Donald Stufft <donald@stufft.io> wrote:
On Jul 17, 2013, at 3:58 PM, zooko <zooko@zooko.com> wrote:
In my opinion it is a good idea to embed, not just the *name* of the package that your package depends on, but also the public key or public keys that your package requires the depended-upon package to be signed by.
The problem with this is it makes it more difficult to replace a library with a patched copy.
Example: I want to install the library Foo, Foo depends on Bar, and Bar depends on Broken. Broken is, well, broken, and I want to use a patched version of it locally. So I fix Broken, upload it to my private index server, and I pip install from that.
If public keys are encoded as part of the dependency chain, not only do I need to patch Broken but I also need to patch Foo and Bar _and_ anything else that depends on Foo, Bar, or Broken _and_ anything else that depends on those, so on until we reach the leaves.
Packages should have signatures. Dependency should be by name. End tooling should provide a method to make a set of requirements with certain signatures or hashes for a specific instance of this installation. (E.g. Awesome, Inc could have a set of requirements that contain Foo, Bar and their own patched version of Broken along with the keys used to sign all of them).
----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
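A sketch of the "n signers out of m possible signers" policy Daniel mentions, with hypothetical key names: a package is accepted when at least a threshold of its signers appear in the trusted set.

    def accepted(signers, trusted_keys, threshold):
        # Count only signatures made by keys we actually trust.
        return len(set(signers) & set(trusted_keys)) >= threshold

    trusted = {"key-publisher", "key-mine", "key-coworker"}
    print(accepted({"key-publisher", "key-mine"}, trusted, 2))  # True
    print(accepted({"key-unknown", "key-mine"}, trusted, 2))    # False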
participants (11)
- Christian Heimes
- Daniel Holth
- Donald Stufft
- holger krekel
- Jannis Leidel
- Justin Cappos
- Nick Coghlan
- Ronald Oussoren
- Trishank Karthik Kuppusamy
- Vinay Sajip
- zooko