[Catalog-sig] Proposal: close the PyPI file-replacement loophole

Yuval Greenfield ubershmekel at gmail.com
Wed Feb 1 12:06:18 CET 2012

On Wed, Feb 1, 2012 at 11:14 AM, Chris Withers <chris at simplistix.co.uk> wrote:

> On 01/02/2012 09:01, Yuval Greenfield wrote:
>> Would you testify that HTTP is secure because I can emulate TLS in
>> javascript?
> What's that got to do with the price of eggs?

I can maintain my own personal log of package SHA-512 hashes and thus locally
avoid this security hole in PyPI. The system as a whole is still vulnerable
by default. That's why HTTP is considered insecure even though you can
build secure solutions on top of it. PyPI would be the insecure
infrastructure upon which secure frameworks can be built. Immutability would
make the default behavior of PyPI more secure.
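To make the "personal log" idea concrete, here is a minimal sketch of a trust-on-first-use check, assuming the log is a local JSON file that maps filenames to SHA-512 digests. The function names and the log layout are hypothetical, made up for illustration; nothing like this ships with pip or easy_install.

```python
import hashlib
import json
from pathlib import Path


def sha512_of(path):
    """Return the hex SHA-512 digest of the file at `path`."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        # Read in chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_against_log(path, log_path):
    """Compare a downloaded file with the personal log (hypothetical helper).

    Returns "new" the first time a filename is seen (and records its digest),
    "ok" when the digest matches the recorded one, and "changed" when the
    file under that name no longer matches what was recorded.
    """
    log_file = Path(log_path)
    log = json.loads(log_file.read_text()) if log_file.exists() else {}
    name = Path(path).name
    digest = sha512_of(path)
    if name not in log:
        log[name] = digest
        log_file.write_text(json.dumps(log, indent=2, sort_keys=True))
        return "new"
    return "ok" if log[name] == digest else "changed"
```

This only protects the person who keeps the log, and only from the second download onward, which is the point: everyone else still gets whatever the index serves.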

>> PyPI should do what it can within reason to be consistent and safe for
>> all its users.
> *sigh* that's what the MD5s are for. What threat, exactly are you so
> worried about here? That someone investigates and chooses to use a package,
> and then, having done so, decides to re-download an identical version of
> that package which has been maliciously uploaded, and happens to have the
> same MD5 checksum as the one they've already downloaded?

Let's assume I made a package called pybanker that requires a specific
version of SQLAlchemy (e.g. 0.6.8). When I tell people to download
SQLAlchemy 0.6.8, do I have to tell them the exact hash? Does setup.py/cfg
allow me to require a specific hash for SQLAlchemy when pip/easy_install
automatically resolves dependencies? So when banks around the world use
pybanker, and thus SQLAlchemy 0.6.8, they don't know what the original hash
was. In the meantime a security threat has manifested in SQLAlchemy (e.g.
pythonpackages was hacked, a SQLAlchemy maintainer's
password/computer/network was compromised, etc.). The hacker modifies
SQLAlchemy 0.6.8 so that it still works perfectly while adding a backdoor to
the system or relaying all transactions to a remote server.

I hope we don't have to wait for this attack vector to be used (and
detected and publicized) before this loophole is patched.

Obviously this isn't the only problem if the account of an SQLAlchemy
maintainer is compromised - other threats can manifest as well. That
doesn't mean this specific threat should be ignored, especially considering
that it's a stealthy vector.

tl;dr - the classic bait and switch is why user-generated content is
immutable on most web services. Where edits are allowed, they are usually
limited to a short window of time or require an administrator in the loop.
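As a rough illustration of what immutability means on the index side, here is a toy in-memory sketch, not a description of how PyPI is implemented. The class and exception names are hypothetical. The only rule it enforces is that a filename, once published, can never point at different bytes.

```python
import hashlib


class FileReplacementError(Exception):
    """Raised when an upload would change an already-published file."""


class ImmutableIndex:
    """Toy index (hypothetical): published filenames never change content."""

    def __init__(self):
        self._digests = {}  # filename -> hex SHA-512 of the published bytes

    def upload(self, filename, content):
        """Publish `content` under `filename`, refusing silent replacement."""
        digest = hashlib.sha512(content).hexdigest()
        published = self._digests.get(filename)
        if published is None:
            # First upload under this name: record it.
            self._digests[filename] = digest
            return digest
        if published != digest:
            # Same name, different bytes: this is the bait and switch.
            raise FileReplacementError(
                filename + " is already published with different content"
            )
        # Re-uploading identical bytes changes nothing, so allow it.
        return digest
```

With a rule like this, fixing a broken release means publishing a new version number rather than swapping the file behind an existing one.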
