On 7 November 2016 at 07:20, Chris Barker
So how is allowing anyone to push something to PyPi that will run arbitrary code on a CI server, that will push arbitrary code to PyPi that will then get run by anyone that pip installs it?
PyPI currently has the ability to impersonate any PyPI publisher, which makes it an enormous security threat in and of itself, so we need to limit the attack surfaces that it exposes.
Essentially, we have already said that there is no such thing as "trusting PyPi" -- you need to trust each individual package. So how is any sort of auto-build system going to change that?
Currently we're reasonably confident that the only folks that can compromise Django users (for example) are the Django devs and the PyPI service administrators. The former is an inherent problem in trusting any software publisher, while the latter we currently mitigate by tightly controlling admin access to the production PyPI service, and strictly limiting the server-side processing that PyPI performs on uploaded files to reduce the opportunities for privilege escalation attacks.

Once you start providing a server-side build service, however, you're opening up additional attack vectors on the core publishing system, and getting any aspect of that wrong may lead to publishers being able to impersonate *each other*.

Unfortunately, offering secure multi-tenancy in software services when you allow tenants to run arbitrary code is a really hard problem - it's the main reason that OpenShift v3 hasn't fully displaced OpenShift v2 yet, and that's with the likes of Red Hat, Google, CoreOS and Deis collaborating on the underlying Kubernetes infrastructure.

Linux distros and conda-forge duck that multi-tenancy problem by treating the build system itself as the publisher, with everyone with access to it being a relatively trusted co-tenant (think "share house with no locks on interior doors" rather than "apartment complex"). That approach works OK at smaller scales, but the gatekeeping involved in approving new co-publishers introduces off-putting friction for potential participants (hence both app developers and data analysts finding ways to bypass the sysadmin and OS developer dominated Linux packaging ecosystems).

For PyPI, we can mitigate the difficulty by getting the builds to happen somewhere else (like external CI services), but even then you still have a non-trivial service integration problem to manage, especially if you decide to tackle it through a "bring your own build service" approach (à la GitHub CI integration).
Whichever way you go though (native build service, or integration with external build services), you're signing up for a major ongoing maintenance task, as you're either now responsible for a shared build system serving tens of thousands of software publishers [1], or else you're responsible for maintaining a coherent publisher UX while also maintaining compatibility with multiple external systems that you don't directly control.

Cheers,
Nick.

[1] There were ~35k distinct publisher accounts on PyPI when Donald last checked in August: https://github.com/pypa/warehouse/issues/1428

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia