[Catalog-sig] A modest proposal for securing PyPI with TUF

Nick Coghlan ncoghlan at gmail.com
Thu Mar 14 08:03:00 CET 2013


On Wed, Mar 13, 2013 at 11:58 AM, Justin Cappos <jcappos at poly.edu> wrote:
> We use the simple directory and filenames because that is what pip uses.
>
> You have a nice suggestion to include other metadata in the TUF metadata.
> We certainly could do this if desirable. It would require a redesign of
> the PyPI API, and we weren't sure if this was wanted. Our current doc /
> prototype is trying to minimize the changes needed all around.

I think what you currently propose (signing the metadata pip already
understands) is a good first step, especially if we can have PyPI
signing *all* the target metadata in the initial deployment, and defer
the delegation to package developers until the next phase of the
rollout (we obviously want to do that eventually, but it's easier if
we can get a preliminary version working without needing to change the
upload tools).
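
As a rough illustration of that first step, the sketch below builds the
kind of per-file record (length plus hash) that TUF targets metadata
carries, for paths pip already requests. The function names, the example
paths and the file contents are all hypothetical, and the signing step is
left out because it depends on whichever key scheme the prototype uses.

```python
import hashlib
import json


def target_entry(data: bytes) -> dict:
    """Describe one file by its length and SHA-256 hash."""
    return {
        "length": len(data),
        "hashes": {"sha256": hashlib.sha256(data).hexdigest()},
    }


def build_targets(files: dict) -> dict:
    """Map each index path to its entry; `files` maps path -> content bytes."""
    return {path: target_entry(data) for path, data in sorted(files.items())}


def canonical_bytes(targets: dict) -> bytes:
    """Serialize deterministically, so the same targets always yield the
    same bytes to sign (assumption: sorted-key compact JSON)."""
    return json.dumps(
        {"targets": targets}, sort_keys=True, separators=(",", ":")
    ).encode("utf-8")


# Hypothetical index content, standing in for real files on PyPI.
files = {
    "simple/Django/index.html": b"<a href='Django-1.5.tar.gz'>Django-1.5.tar.gz</a>",
    "packages/source/D/Django/Django-1.5.tar.gz": b"not a real tarball",
}

targets = build_targets(files)
payload = canonical_bytes(targets)  # what a PyPI-held key would sign
```

Nothing here requires a change to the upload tools: the records are
derived entirely from files PyPI already hosts.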

While such an approach doesn't immediately give us the end-to-end
security we ultimately want to set up, it means a few things become
possible:
1. Rather than requiring every developer to start signing end-to-end
metadata immediately, we can ask a few major projects (e.g. Django,
Zope, NumPy) if they're willing to serve as guinea pigs for the
developer target signing delegations. Once we're happy the signing
process is usable, we can make it generally available as an option to
projects (while also allowing them to continue with PyPI's existing
upload mechanisms and offer only PyPI-to-user integrity checks rather
than developer-to-user ones).
2. The PSF infrastructure team and the PyPI maintainers get a chance
to work with the installation tool developers to get the PyPI-to-user
link sorted out before needing to work on the developer-to-PyPI link.
3. We can consider alternative mirroring solutions based on replicating
the TUF metadata rather than on PEP 381.
Eventually I would also like to tunnel a subset of the PEP 426
metadata through TUF's "custom" fields, but again, I think we're
better off skipping that for the first iteration. Incremental
enhancements are a good thing :)
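
For reference, a minimal sketch of what such tunnelling might look like
is given below. It assumes a target entry is a plain dict that can carry
an optional "custom" mapping; the choice of fields to pass through and
the helper name are hypothetical, not part of any agreed design.

```python
# Hypothetical subset of PEP 426 fields to carry in the TUF metadata.
TUNNELLED_FIELDS = ("name", "version", "summary")


def add_custom(entry: dict, metadata: dict) -> dict:
    """Return a copy of `entry` with the selected fields under "custom".

    Fields outside TUNNELLED_FIELDS are dropped, and the original entry
    is left unmodified.
    """
    custom = {k: metadata[k] for k in TUNNELLED_FIELDS if k in metadata}
    updated = dict(entry)
    if custom:
        updated["custom"] = custom
    return updated


# Placeholder entry and metadata, for illustration only.
entry = {"length": 18, "hashes": {"sha256": "0" * 64}}
metadata = {
    "name": "Django",
    "version": "1.5",
    "summary": "A web framework",
    "author_email": "someone at example.com",
}

entry_with_custom = add_custom(entry, metadata)
```

Keeping the subset explicit means the signed metadata only grows by the
fields clients have been told to expect.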

Regards,
Nick.

>
> Thanks,
> Justin
>
>
> On Wed, Mar 13, 2013 at 2:15 PM, Daniel Holth <dholth at gmail.com> wrote:
>>
>> On Wed, Mar 13, 2013 at 5:13 AM, Trishank Karthik Kuppusamy
>> <tk47 at students.poly.edu> wrote:
>> > Hello Nick,
>> >
>> >
>> > On 3/13/13 4:09 AM, Nick Coghlan wrote:
>> >>
>> >>
>> >> - the PSF board generally stays out of the technical details of
>> >> running the python.org infrastructure, so it's likely that any root
>> >> keys would be handled by the PSF infrastructure committee. A (2, 4) or
>> >> (3, 5) trust configuration would likely be manageable at this level.
>> >
>> >
>> > Understood. We think a higher (t, n) [where t out of n signatures are
>> > needed to trust the metadata for a role] is better for the root role
>> > simply because its crucial metadata (the authorized keys for top-level
>> > roles) should change very rarely.
>> >
>> >
>> >> - at the target delegation level, PyPI supports the registration of
>> >> new projects through the web service (see
>> >> http://docs.python.org/2/distutils/packageindex.html). If my
>> >> understanding of target delegation is correct, this means the "simple"
>> >> and "packages/source/<letter>" delegations will need to be (1, 1) and
>> >> online.
>> >> - higher levels of the target delegation hierarchy could conceivably
>> >> be kept offline, but there seems little value in doing so if they're
>> >> trusting an online (1, 1) key.
>> >
>> >
>> > Fortunately, the "targets/simple" and
>> > "targets/packages/(version)/(letter)/" roles should not require (1, 1)
>> > online keys, as their metadata (simply target delegations and no actual
>> > target files) should also fluctuate fairly rarely. I should make this
>> > clearer in our design document.
>> >
>> >
>> >> - many PyPI packages are maintained by single developers, so (1, 1) or
>> >> (1, n) is likely to be the only generally feasible level of signing at
>> >> the project level.
>> >
>> >
>> > Yes, the package developers themselves could choose any (t, n) they
>> > like. In our design, we propose that PyPI could eventually delegate to
>> > "stable" packages which need little change (and use more security with
>> > more offline keys) and to "unstable" packages which need frequent change
>> > (and use less security with more online keys).
>> >
>> >
>> >> With the current focus being on getting an improvement from the status
>> >> quo that we can successfully deploy in a reasonable period of time,
>> >> the target delegation side of things probably needs to be
>> >> substantially simpler in the initial iteration. Yes, it leaves us open
>> >> to certain vulnerabilities we would like to remove in the long run,
>> >> but we need to be very cautious in the additional demands we place on
>> >> the users uploading to PyPI. It may even mean the initial iteration
>> >> allows projects to rely on a PyPI provided signing key for their TUF
>> >> metadata, using the existing upload mechanisms to add the files to
>> >> PyPI.
>> >
>> >
>> > I agree that there is a delicate problem of balancing security with
>> > usability here, especially in the beginning.
>> >
>> > You raised a very good issue there: on first migration, how would PyPI
>> > accommodate packages which have not had their target files delegated to
>> > their developers? We imagine that in this case, PyPI could assume
>> > initial responsibility for these packages, and later PyPI would delegate
>> > those packages to their respective developers.
>> >
>> > Thanks for your input,
>> > Trishank
>>
>> With all the different kinds of metadata, it's interesting to note
>> that currently TUF seems to only be concerned with the available file
>> names and their integrity. (Some of us will think of PEP 426
>> "PKG-INFO" first when we hear the word metadata.)
>>
>> It looks like the D metadata lists all the filenames for Django, and
>> then Django lists them again with hashes and signatures. Why all the
>> lists? Does every Django release re-assert all the versions of Django
>> that are available on the index?
>>
>> How might I deal with producing the official source distribution
>> myself and having a friend produce the official Windows build of a
>> package?
>>
>> As an aside, PyPI has been doubling in size every 1.5 - 2 years.
>>
>> Thanks
>>
>> Daniel Holth
>
>



-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

