[Distutils] File integrity checking and host blocking for EasyInstall

Phillip J. Eby pje at telecommunity.com
Mon Aug 15 01:06:38 CEST 2005

After thinking over the last week's distutils-sig discussion about 
security, signatures, etc., I think I have a plan for handling basic file 
integrity checking and (non-cryptographic) trust management for 
EasyInstall.  It is not a high-security end-to-end solution, but I think it 
will allow security-conscious persons to take a more "locked down" approach 
if they want to, while providing everyone else with some baseline 
protection against corrupted files.

The first part of the plan is to add md5 digest checking to 
EasyInstall.  Because one of EasyInstall's design goals is to make it easy 
for anybody to publish links to packages, we need to be able to include the 
md5 digest in a package's URL.  I'm thinking we could achieve this via 
an '#md5=...' fragment identifier.  For example, a setuptools source 
archive URL might be:


The advantage of this approach is that it allows anyone to assert what the 
md5 of the targeted file is, and it can be asserted in any web page, just 
by pointing an HREF at the file.  EasyInstall could detect the '#md5=' 
marker, and then use this to verify the file during download.
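
As a rough illustration of the '#md5=' idea, here is a minimal sketch in
modern Python 3 (an assumption; it is not EasyInstall's actual code).  The
helper names split_md5_fragment and verify_md5 are hypothetical, and the
URL in the usage note is a placeholder.

```python
import hashlib
from urllib.parse import urldefrag


def split_md5_fragment(url):
    """Return (url_without_fragment, expected_md5_or_None).

    Hypothetical helper: recognizes only a fragment of the form 'md5=<hex>'.
    """
    base, fragment = urldefrag(url)
    if fragment.startswith("md5="):
        return base, fragment[len("md5="):].lower()
    return base, None


def verify_md5(chunks, expected):
    """Hash an iterable of byte chunks and compare with the expected digest.

    Feeding chunks one at a time lets the check run while the file is
    being downloaded, without holding the whole file in memory.
    """
    digest = hashlib.md5()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest() == expected
```

A caller would split a link such as
"http://example.com/pkg-1.0.tar.gz#md5=<hex digest>" into the download URL
and the expected digest, then pass the downloaded blocks to verify_md5.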

The disadvantage, of course, is that PyPI doesn't currently support this; 
it creates a separate link to a page that displays the md5, and that URL 
doesn't contain anything that connects it back to the distribution file it 
refers to.  I could probably create some kind of parsing hack to fix that 
for PyPI, but it seems it might be worth adding the #md5 trick to PyPI to 
support this.

EasyInstall would also need to grow a --require-md5 option, which would 
refuse to install anything that comes from a Subversion checkout, or from 
a distribution without a known md5 digest.
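
The --require-md5 rule could be sketched as a small policy check.  This is
only one possible reading of the proposal; the function name may_install
and the source_kind values ("svn", "download") are assumptions made for
illustration.

```python
def may_install(source_kind, expected_md5=None, require_md5=False):
    """Decide whether a distribution may be installed.

    source_kind  -- hypothetical label: "svn" for a Subversion checkout,
                    "download" for a downloaded distribution file
    expected_md5 -- digest asserted for the file, or None if unknown
    require_md5  -- True when the --require-md5 option is in effect
    """
    if not require_md5:
        return True
    if source_kind == "svn":
        # A checkout has no single file whose digest could be asserted.
        return False
    return expected_md5 is not None
```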

In addition to md5 support in EasyInstall, I propose to also add it to 
ez_setup; there, however, the md5 values for various distributions will be 
hardcoded into ez_setup.py itself.  (I'll make my "release" script append 
the md5 digests for new distributions to the end of ez_setup.py.)  In 
this way, the bootstrap installation of setuptools can also be reasonably 
secured, as long as you trust a particular version of ez_setup.py.
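
A hardcoded digest table of this kind might look roughly like the sketch
below.  The names MD5_DATA and validate_md5, the filename, and the payload
are all placeholders invented for illustration; the digest is computed
from the placeholder payload rather than taken from any real release.

```python
import hashlib

# Hypothetical table mapping distribution filenames to md5 hex digests.
# In the plan above, a release script would append real entries here.
MD5_DATA = {
    "example_pkg-1.0.egg": hashlib.md5(b"example payload").hexdigest(),
}


def validate_md5(filename, data, table=MD5_DATA):
    """Return data unchanged if it matches the recorded digest.

    Raises ValueError on a mismatch.  Files with no entry in the table
    pass through unchecked; that is a design assumption of this sketch,
    and a stricter bootstrap could reject them instead.
    """
    expected = table.get(filename)
    if expected is not None:
        actual = hashlib.md5(data).hexdigest()
        if actual != expected:
            raise ValueError("md5 validation of %s failed" % filename)
    return data
```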

The next part of the plan would be to add an --allow-hosts option to 
EasyInstall.  This would be a list of host wildcards that EasyInstall would 
be allowed to contact.  For example, --allow-hosts=*.python.org would let 
EasyInstall download or scan pages from PyPI or www.python.org, but not 
anywhere else.  The default, if not specified, would be '*', meaning that 
any host may be accessed.  If EasyInstall finds itself about to download a 
page or distribution from a host that isn't allowed, it will abort with a 
message explaining the problem.
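
The host check could be approximated with shell-style wildcard matching.
The sketch below uses Python 3's fnmatch module as an assumption about how
the wildcards behave; parse_allow_hosts and host_allowed are hypothetical
names, not part of EasyInstall.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse


def parse_allow_hosts(option):
    """Split a comma-separated --allow-hosts value into patterns."""
    return [p.strip().lower() for p in option.split(",") if p.strip()]


def host_allowed(url, allow_hosts=("*",)):
    """Return True if the URL's host matches any allowed wildcard.

    The default of ('*',) mirrors the proposed default: any host may be
    accessed.
    """
    host = (urlparse(url).hostname or "").lower()
    return any(fnmatchcase(host, pattern.lower()) for pattern in allow_hosts)
```

Note that under this matching rule '*.python.org' covers www.python.org
but not the bare host python.org, so a user who wants both would list
both patterns.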

This would allow folks like Paul Moore to configure a default --allow-hosts 
list in their pydistutils.cfg, to prevent EasyInstall from downloading 
things from just any old place on the Internet.  Once he's verified that he 
trusts a particular site, he can edit pydistutils.cfg and add it, or else 
manually download the blocked URL, publish it on a trusted intranet host, etc.

So, this is not a complete security solution, as it doesn't deal with 
end-to-end file integrity, and could easily be subverted by taking over a 
site somewhere in the middle (e.g. python.org).  But until we have more of 
the cryptographic infrastructure in place, I think this plan could provide 
us with a good starting point.  Comments, anyone?
