On Wed, Dec 31, 2014 at 2:26 AM, Donald Stufft email@example.com wrote:
On Dec 10, 2014, at 10:16 PM, Vladimir Diaz firstname.lastname@example.org wrote:
I am a research programmer at the NYU School of Engineering. My colleagues (Trishank Kuppusamy and Justin Cappos) and I are requesting community feedback on our proposal, "Surviving a Compromise of PyPI." The two-stage proposal can be reviewed online at:
Summary of the Proposal:
"Surviving a Compromise of PyPI" proposes how the Python Package Index (PyPI) can be amended to better protect end users from altered or malicious packages, and to minimize the extent of PyPI compromises against affected users. The proposed integration allows package managers such as pip to be more secure against various types of security attacks on PyPI and defend end users from attackers responding to package requests. Specifically, these PEPs describe how PyPI processes should be adapted to generate and incorporate repository metadata, which are signed text files that describe the packages and metadata available on PyPI. Package managers request (along with the packages) the metadata on PyPI to verify the authenticity of packages before they are installed. The changes to PyPI and tools will be minimal by leveraging a library, The Update Framework https://github.com/theupdateframework/tuf, that generates and transparently validates the relevant metadata.
The first stage of the proposal (PEP 458 http://legacy.python.org/dev/peps/pep-0458/) uses a basic security model that supports verification of PyPI packages signed with cryptographic keys stored on PyPI, requires no action from developers and end users, and protects against malicious CDNs and public mirrors. To support continuous delivery of uploaded packages, PyPI administrators sign for uploaded packages with an online key stored on PyPI infrastructure. This level of security prevents packages from being accidentally or deliberately tampered with by a mirror or a CDN because the mirror or CDN will not have any of the keys required to sign for projects.
The second stage of the proposal (PEP 480 http://legacy.python.org/dev/peps/pep-0480/) is an extension to the basic security model (discussed in PEP 458) that supports end-to-end verification of signed packages. End-to-end signing allows both PyPI and developers to sign for the packages that are downloaded by end users. If the PyPI infrastructure were to be compromised, attackers would be unable to serve malicious versions of these packages without access to the project's developer key. As in PEP 458, no additional action is required by end users. However, PyPI administrators will need to periodically (perhaps every few months) sign metadata with an offline key. PEP 480 also proposes an easy-to-use key management solution for developers, how to interface with a potential build farm on PyPI infrastructure, and discusses the security benefits of end-to-end signing. The second stage of the proposal simultaneously supports real-time project registration and developer signatures, and when configured to maximize security on PyPI, less than 1% of end users will be at risk even if an attacker controls PyPI and goes undetected for a month.
We thank Nick Coghlan and Donald Stufft for their valuable contributions, and Giovanni Bajo and Anatoly Techtonik for their feedback.
I’ve just finished (re)reading the white paper, PEP 458, PEP 480, and some of the supporting documentation on the TUF website.
I’m confused about what exactly is contained within the TUF metadata and who signs what in a PEP 480 world.
The following illustration shows what is contained within TUF metadata (JSON files): https://github.com/vladimir-v-diaz/pep-on-pypi-with-tuf/raw/master/pep-0458/... Note: In this illustration, the "snapshot" and "targets" roles are renamed "release" and "projects", respectively.
If you're interested in what exactly is contained in these JSON files, here is example metadata: https://github.com/theupdateframework/tuf/tree/develop/examples/repository/m...
In a PEP 480 world, project developers sign a single JSON file. For example, the developer(s) of the "Requests" project sign their assigned JSON file, named "/targets/claimed/Requests.json". Specifically, a signature is generated over the "signed" entry https://github.com/theupdateframework/tuf/blob/develop/examples/repository/metadata/targets.json#L9-L49 of the dictionary. Once the signature is generated, it is added to the "signatures" entry https://github.com/theupdateframework/tuf/blob/develop/examples/repository/metadata/targets.json#L2-L7 of the JSON file.
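To make the "signed" / "signatures" split concrete, here is a minimal sketch of the signing step. It is not TUF's implementation: HMAC-SHA256 with a shared secret stands in for the developer's public-key signature, and the key id, the secret, the serialization, and the target entry are all hypothetical values chosen for illustration.

```python
import copy
import hashlib
import hmac
import json


def canonical_bytes(signed):
    # Serialize deterministically so signer and verifier see identical bytes.
    # (Assumption: sorted-key compact JSON; real TUF defines its own encoding.)
    return json.dumps(signed, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_metadata(metadata, keyid, secret):
    # Sign only the "signed" entry, then record the result under "signatures".
    # HMAC-SHA256 is a stand-in for a real public-key signature scheme.
    sig = hmac.new(secret, canonical_bytes(metadata["signed"]), hashlib.sha256).hexdigest()
    result = copy.deepcopy(metadata)
    result["signatures"].append({"keyid": keyid, "sig": sig})
    return result


def verify_metadata(metadata, keyid, secret):
    # Recompute the signature over "signed" and compare with the stored one.
    expected = hmac.new(secret, canonical_bytes(metadata["signed"]), hashlib.sha256).hexdigest()
    return any(
        entry["keyid"] == keyid and hmac.compare_digest(entry["sig"], expected)
        for entry in metadata["signatures"]
    )


# Hypothetical content of a developer-signed file such as Requests.json.
index_page = b"<html>hypothetical index page for Requests</html>"
unsigned = {
    "signatures": [],
    "signed": {
        "_type": "Targets",
        "targets": {
            "simple/Requests/index.html": {
                "length": len(index_page),
                "hashes": {"sha256": hashlib.sha256(index_page).hexdigest()},
            }
        },
    },
}

signed_file = sign_metadata(unsigned, "hypothetical-dev-key", b"demo-secret")
print(verify_metadata(signed_file, "hypothetical-dev-key", b"demo-secret"))
```

Because the signature covers only the "signed" entry, more signatures can be appended to "signatures" later without invalidating the ones already there.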
In figure 1 of PEP 480, PyPI signs for all metadata except those listed under the "roles signed by developer keys" label: https://github.com/vladimir-v-diaz/pep-maximum-security-model/blob/master/pe...
Currently when you do something like ``pip install FooBar``, pip fetches /simple/FooBar/ to look for potential installation candidates, and when it finds one it downloads it and installs it. This is all “signed” by online keys via TLS.
1. In a TUF world, would pip still fetch /simple/FooBar/ to discover things to install, or would it fetch some TUF metadata to find things to install?
In the integration/demo we did with pip, we treated each /simple/ html file as a target (listed the hash and file size of these html index pages in TUF metadata). That is, pip still fetched /simple/FooBar/ to discover distributions to install, but we verified the html files *and* distributions against TUF metadata. In PEP 458, we state that "/simple" is also listed in TUF metadata: http://legacy.python.org/dev/peps/pep-0458/#pypi-and-tuf-metadata (last paragraph just before the diagram).
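As a rough sketch of what "treating the index page as a target" means on the client side, the check below compares downloaded bytes against the hash and file size recorded in metadata. The function name, the target paths, and the metadata layout are assumptions made for illustration. This is not the code from the pip demo, and it assumes the metadata itself has already been verified.

```python
import hashlib


def verify_target(path, content, targets):
    """Return True if content matches the length and SHA-256 listed for path."""
    entry = targets.get(path)
    if entry is None:
        return False  # not listed in the trusted metadata: reject
    if len(content) != entry["length"]:
        return False  # cheap size check before hashing
    return hashlib.sha256(content).hexdigest() == entry["hashes"]["sha256"]


def describe(content):
    # Helper that builds the metadata entry for some bytes (illustration only).
    return {"length": len(content), "hashes": {"sha256": hashlib.sha256(content).hexdigest()}}


# Hypothetical index page and distribution for the FooBar project.
index_html = b"<a href='FooBar-1.0.tar.gz'>FooBar-1.0.tar.gz</a>"
sdist = b"pretend these are the bytes of FooBar-1.0.tar.gz"

trusted_targets = {
    "simple/FooBar/index.html": describe(index_html),
    "packages/FooBar/FooBar-1.0.tar.gz": describe(sdist),
}

# Both the index page *and* the distribution are checked before installing.
index_ok = verify_target("simple/FooBar/index.html", index_html, trusted_targets)
sdist_ok = verify_target("packages/FooBar/FooBar-1.0.tar.gz", sdist, trusted_targets)
print(index_ok and sdist_ok)
```

A mirror or CDN that rewrites the index page to point at a different file changes the page's hash, so the first check already fails.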
Another option is to avoid crawling/listing the simple index pages and just search TUF metadata for distributions, but this approach will require design changes to pip. We went with the approach (treat the index pages as targets) that required minimal changes to pip.
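For comparison, the alternative (skipping the index pages and discovering distributions directly from metadata) could look roughly like the sketch below. The "packages/<project>/" path layout and the function name are hypothetical. Neither PEP prescribes this layout, and the targets mapping is assumed to come from metadata that has already been verified.

```python
def find_distributions(project, targets):
    """List target paths in verified metadata that belong to the given project."""
    prefix = "packages/" + project + "/"
    return sorted(path for path in targets if path.startswith(prefix))


# Hypothetical verified targets metadata (hashes and lengths omitted for brevity).
verified_targets = {
    "packages/FooBar/FooBar-1.1.tar.gz": {},
    "packages/FooBar/FooBar-1.0.tar.gz": {},
    "packages/Other/Other-2.0.tar.gz": {},
}

candidates = find_distributions("FooBar", verified_targets)
print(candidates)
```

The lookup is simple, but it replaces pip's existing link-discovery logic, which is the design change mentioned above.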
2. If it’s fetching /simple/FooBar/, is that secured by TUF?
Yes, see my response to (1).
3. If it’s secured by TUF, who signs the TUF metadata that talks about /simple/FooBar/ in PEP 480: the author or PyPI?
In PEP 480, project authors sign for both their project's index page and distribution(s) (as indicated in the JSON file):
"A claimed or recently-claimed project will need to upload in its transaction to PyPI not just targets (a simple index as well as distributions) but also TUF metadata. The project MAY do so by uploading a ZIP file containing two directories, /metadata/ (containing delegated targets metadata files) and /targets/ (containing targets such as the project simple index and distributions that are signed by the delegated targets metadata)."
See the second paragraph of http://legacy.python.org/dev/peps/pep-0480/#snapshot-process.
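The upload layout quoted above can be sketched with the standard zipfile module. Only the two top-level directories, metadata/ and targets/, come from the PEP text; the file names inside them and the helper function are hypothetical.

```python
import io
import zipfile


def build_upload_zip(metadata_files, target_files):
    """Pack delegated targets metadata and targets into one in-memory ZIP."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, data in metadata_files.items():
            archive.writestr("metadata/" + name, data)
        for name, data in target_files.items():
            archive.writestr("targets/" + name, data)
    return buffer.getvalue()


# Hypothetical transaction for a claimed project named FooBar.
upload = build_upload_zip(
    metadata_files={"FooBar.json": b'{"signatures": [], "signed": {}}'},
    target_files={
        "simple/FooBar/index.html": b"<html>hypothetical index</html>",
        "packages/FooBar/FooBar-1.0.tar.gz": b"hypothetical sdist bytes",
    },
)

with zipfile.ZipFile(io.BytesIO(upload)) as archive:
    names = sorted(archive.namelist())
print(names)
```

Shipping the signed metadata next to the index page and distributions lets PyPI check the developer's signature before accepting the transaction.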
Let me know exactly what needs to change in the PEPs to make everything explained above clearer. For example, in PEP 458 we provide a link/reference https://www.python.org/dev/peps/pep-0458/#what-additional-repository-files-are-required-on-pypi (last paragraph of this subsection) to the Metadata document https://github.com/theupdateframework/tuf/blob/develop/METADATA.md indicating the content of the JSON files, but should the illustration https://github.com/vladimir-v-diaz/pep-on-pypi-with-tuf/raw/master/pep-0458/figure4.pdf I've included in this reply also be added?
Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA