[Catalog-sig] Proposal: close the PyPI file-replacement loophole

Donald Stufft donald.stufft at gmail.com
Wed Feb 1 12:10:49 CET 2012


I should mention that this is the worst-case scenario, not the likely one. However, I think the possible damage from it far outweighs the small benefit of mutable packages.

The more likely failure scenarios are applications breaking on install/deploy or silently corrupting data. In both cases, but especially with silent data corruption, I think the possible damage again outweighs the small benefit of mutable packages.


On Wednesday, February 1, 2012 at 6:06 AM, Yuval Greenfield wrote:

> On Wed, Feb 1, 2012 at 11:14 AM, Chris Withers <chris at simplistix.co.uk> wrote:
> > On 01/02/2012 09:01, Yuval Greenfield wrote:
> > > Would you testify that HTTP is secure because I can emulate TLS in
> > > javascript?
> > 
> > What's that got to do with the price of eggs?
> > 
> > 
> 
> I can maintain my own personal log of package SHA-512 hashes and thus locally avoid this security hole in PyPI, but the system as a whole is still vulnerable by default. That's why HTTP is considered insecure even though you can build secure solutions on top of it. PyPI would be the insecure infrastructure upon which secure frameworks can be built. Immutability would make PyPI's default behavior more secure.
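
A minimal sketch of the kind of personal hash log described above, assuming a trust-on-first-use model: the first digest seen for a filename is recorded, and any later download of the same filename must match it. The function names, the JSON log format, and the return values are hypothetical choices for illustration, not part of pip, easy_install, or PyPI.

```python
import hashlib
import json
from pathlib import Path


def sha512_of(path):
    """Return the hex SHA-512 digest of a file, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_against_log(archive, log_path):
    """Trust-on-first-use check of a downloaded archive (hypothetical helper).

    Returns "recorded" the first time a filename is seen, "match" if the
    digest equals the logged one, and "MISMATCH" if the file has changed.
    """
    log_file = Path(log_path)
    log = json.loads(log_file.read_text()) if log_file.exists() else {}
    name = Path(archive).name
    digest = sha512_of(archive)
    if name not in log:
        # First sighting: remember the digest for future comparisons.
        log[name] = digest
        log_file.write_text(json.dumps(log, indent=2, sort_keys=True))
        return "recorded"
    return "match" if log[name] == digest else "MISMATCH"
```

Such a log only protects the person who keeps it, and only if their first download was the genuine file, which is the limitation the paragraph above points out.
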
> 
>  
> > > PyPI should do what it can within reason to be consistent and safe for
> > > all its users.
> > 
> > *sigh* that's what the MD5s are for. What threat, exactly, are you so worried about here? That someone investigates and chooses to use a package, and then, having done so, decides to re-download an identical version of that package which has been maliciously uploaded, and happens to have the same MD5 checksum as the one they've already downloaded?
> > 
> 
> Let's assume I made a package called pybanker that requires a specific version of SQLAlchemy (e.g. 0.6.8). When I tell people to download SQLAlchemy 0.6.8, do I have to tell them the exact hash? Does setup.py/cfg allow me to require a specific hash for SQLAlchemy when pip/easy_install resolves dependencies automatically? So when banks around the world use pybanker, and thus SQLAlchemy 0.6.8, they don't know what the original hash was. In the meantime a security threat has manifested in SQLAlchemy (e.g. pythonpackages was hacked, or an SQLAlchemy maintainer's password/computer/network was compromised). The hacker modifies SQLAlchemy 0.6.8 to work perfectly while adding a backdoor to the system or relaying all transactions to a remote server.
> 
> I hope we don't have to wait for this attack vector to be used (and detected and publicized) before this loophole is patched.
> 
> Obviously this isn't the only problem if the account of an SQLAlchemy maintainer is compromised - other threats can manifest as well. That doesn't mean this specific threat should be ignored, especially considering that it's a stealthy vector. 
> 
> tl;dr - the classic bait and switch is why user-generated content is immutable on most web services. If edits are allowed, it is usually only for a limited time or with an administrator in the loop.
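
The policy in the tl;dr above could be sketched roughly as follows. This is an illustration only: the one-hour grace period, the function name, and the `admin_approved` flag are all assumptions made for the example, not anything PyPI implements.

```python
from datetime import datetime, timedelta

# Hypothetical window during which an uploader may still fix a bad upload.
GRACE_PERIOD = timedelta(hours=1)


def may_replace_file(uploaded_at, now, admin_approved=False):
    """Decide whether an already-published release file may be overwritten.

    Replacement is allowed only inside the grace period after the original
    upload, or at any time with explicit administrator approval.
    """
    if admin_approved:
        return True
    return now - uploaded_at <= GRACE_PERIOD


uploaded = datetime(2012, 2, 1, 12, 0)
# Thirty minutes after upload: still inside the grace period.
print(may_replace_file(uploaded, uploaded + timedelta(minutes=30)))
# A day later: refused unless an administrator signs off.
print(may_replace_file(uploaded, uploaded + timedelta(days=1)))
```
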
> 
> Yuval 
> _______________________________________________
> Catalog-SIG mailing list
> Catalog-SIG at python.org
> http://mail.python.org/mailman/listinfo/catalog-sig
> 
> 

