[Distutils] File integrity checking and host blocking for EasyInstall
Phillip J. Eby
pje at telecommunity.com
Mon Aug 15 01:06:38 CEST 2005
After thinking over the last week's distutils-sig discussion about
security, signatures, etc., I think I have a plan for handling basic file
integrity checking and (non-cryptographic) trust management for
EasyInstall. It is not a high-security end-to-end solution, but I think it
will allow security-conscious persons to take a more "locked down" approach
if they want to, while providing everyone else with some baseline
protection against corrupted files.
The first part of the plan is to add md5 digest checking to
EasyInstall. Because one of EasyInstall's design goals is to make it easy
for anybody to publish links to packages, we need to be able to include the
md5 digest in a package's URL.  I'm thinking we could achieve this via
an '#md5=...' fragment identifier. For example, a setuptools source
archive URL might be:
http://www.python.org/packages/source/s/setuptools/setuptools-0.5a13.zip#md5=91f31a9058330174640a867cf5d4de57
The advantage of this approach is that it allows anyone to assert what the
md5 of the targeted file is, and it can be asserted in any web page, just
by pointing an HREF at the file. EasyInstall could detect the '#md5='
marker, and then use this to verify the file during download.
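To make the idea concrete, here is a rough sketch of how the '#md5=' marker might be split off a URL and checked against the downloaded bytes. It is written in present-day Python, and the function names (split_md5_fragment, md5_matches) are hypothetical illustrations, not EasyInstall's actual API.

```python
import hashlib
from urllib.parse import urldefrag


def split_md5_fragment(url):
    """Return (url_without_fragment, expected_md5_or_None).

    Hypothetical helper: only a fragment of the form 'md5=<hex>' is
    treated as a digest assertion; any other fragment is ignored.
    """
    base, fragment = urldefrag(url)
    if fragment.startswith("md5="):
        return base, fragment[len("md5="):].lower()
    return base, None


def md5_matches(chunks, expected):
    """Hash the downloaded data chunk by chunk and compare to 'expected'."""
    digest = hashlib.md5()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest() == expected
```

Feeding the hash incrementally means the check can run while the file is being downloaded, rather than after the whole archive is in memory.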
The disadvantage, of course, is that PyPI doesn't currently support this;
it creates a separate link to a page that displays the md5, and that URL
doesn't contain anything that connects it back to the distribution file it
refers to. I could probably create some kind of parsing hack to fix that
for PyPI, but it seems it might be worth adding the #md5 trick to PyPI to
support this.
EasyInstall would also need to grow a --require-md5 option, which would
refuse to install anything from a Subversion checkout or a distribution
without a known md5 digest.
In addition to md5 support in EasyInstall, I propose to also add it to
ez_setup; there, however, the md5 values for various distributions will be
hardcoded into ez_setup.py itself. (I'll make my "release" script append
the md5 digests for new distributions to the end of ez_setup.py.)  In
this way, the bootstrap installation of setuptools can also be reasonably
secured, as long as you trust a particular version of ez_setup.py.
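A minimal sketch of what a hardcoded digest table in ez_setup.py could look like follows. The names KNOWN_MD5 and check_known_md5 are hypothetical, and refusing a distribution that has no entry in the table is my assumption for the sketch, not a settled part of the plan. The single entry reuses the example digest from the URL above.

```python
import hashlib

# Hypothetical table; a release script would append one entry per
# new distribution.
KNOWN_MD5 = {
    "setuptools-0.5a13.zip": "91f31a9058330174640a867cf5d4de57",
}


def check_known_md5(filename, data, known=KNOWN_MD5):
    """Return True only if 'filename' is listed and its md5 matches.

    Assumption: a distribution with no recorded digest is rejected.
    """
    expected = known.get(filename)
    if expected is None:
        return False
    return hashlib.md5(data).hexdigest() == expected
```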
The next part of the plan would be to add an --allow-hosts option to
EasyInstall. This would be a list of host wildcards that EasyInstall would
be allowed to contact. For example, --allow-hosts=*.python.org would let
EasyInstall download or scan pages from PyPI or www.python.org, but not
anywhere else. The default, if not specified, would be '*', meaning that
any host may be accessed. If EasyInstall finds itself about to download a
page or distribution from a host that isn't allowed, it will abort with a
message explaining the problem.
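The host check itself could be sketched roughly as below, using shell-style wildcard matching from the standard library. The function name host_allowed is hypothetical, and using fnmatch semantics for the "host wildcards" is an assumption on my part.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse


def host_allowed(url, allow_hosts=("*",)):
    """Return True if the URL's host matches any allowed wildcard.

    The default of ('*',) mirrors the proposed default: any host
    may be accessed.
    """
    host = (urlparse(url).hostname or "").lower()
    return any(fnmatchcase(host, pattern.lower()) for pattern in allow_hosts)
```

Note that under these semantics '*.python.org' matches www.python.org but not the bare host python.org, so a locked-down configuration would probably want to list both patterns.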
This would allow folks like Paul Moore to configure a default --allow-hosts
list in their pydistutils.cfg, to prevent EasyInstall from downloading
things from just any old place on the Internet. Once he's verified that he
trusts a particular site, he can edit pydistutils.cfg and add it, or else
manually download the blocked URL, publish it on a trusted intranet host, etc.
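For illustration, a default host list in pydistutils.cfg might be read along these lines. The [easy_install] section name, the allow_hosts key, the comma-separated format, and the intranet host name are all assumptions made for this sketch, since the option does not exist yet.

```python
import configparser

# Hypothetical pydistutils.cfg contents.
SAMPLE_CFG = """
[easy_install]
allow_hosts = *.python.org, intranet.example.com
"""


def read_allow_hosts(cfg_text):
    """Return the list of host wildcards, defaulting to ['*']."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    raw = parser.get("easy_install", "allow_hosts", fallback="*")
    return [part.strip() for part in raw.split(",") if part.strip()]
```

Adding a newly trusted site is then a one-line edit to the allow_hosts value.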
So, this is not a complete security solution, as it doesn't deal with
end-to-end file integrity, and could easily be subverted by taking over a
site somewhere in the middle (e.g. python.org). But until we have more of
the cryptographic infrastructure in place, I think this plan could provide
us with a good starting point. Comments, anyone?